Eigenpruning is a method that selectively removes singular values from the weight matrices of large language models (LLMs) to improve their performance on specific tasks. The approach is inspired by interpretability methods that automatically search for subnetworks of a model capable of solving a given task.
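The core operation can be sketched as a truncated SVD edit of one weight matrix, as below. This is a minimal illustration, not the paper's full procedure: the function name `eigenprune` and the `prune_idx` argument are hypothetical, and the task-driven selection of *which* singular values to remove is the part the method automates but this sketch leaves to the caller.

```python
import torch

def eigenprune(weight: torch.Tensor, prune_idx: list[int]) -> torch.Tensor:
    """Remove selected singular values from a weight matrix via SVD.

    A minimal sketch of the core operation; in practice the indices
    in `prune_idx` would be chosen by a task-driven selection step
    that this example does not implement.
    """
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    S = S.clone()
    S[prune_idx] = 0.0  # zero out the pruned singular values
    return U @ torch.diag(S) @ Vh  # reconstruct the edited matrix

# Hypothetical usage on a single linear layer's weights:
W = torch.randn(768, 768)
W_pruned = eigenprune(W, prune_idx=[10, 57, 300])
```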
ComplexityNet is a framework that uses fine-tuned smaller models to assess task complexity and route each task to the most appropriate large language model. The authors report a roughly 90% reduction in computational resource usage while maintaining high code generation accuracy.
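The routing idea can be sketched as a tiered dispatcher, assuming a small fine-tuned classifier that maps each task to a discrete complexity level. The tier names, level scale, and `dispatch` function below are illustrative assumptions, not ComplexityNet's actual API.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Route:
    name: str        # hypothetical model identifier for this tier
    max_level: int   # highest complexity level this model handles

# Hypothetical tiers ordered from cheapest to most capable; the real
# framework fine-tunes a small model to predict each task's complexity.
ROUTES = [
    Route("small-model", max_level=1),
    Route("mid-model", max_level=3),
    Route("large-model", max_level=5),
]

def dispatch(task: str, assess_complexity: Callable[[str], int]) -> str:
    """Send a task to the cheapest model whose tier covers its complexity."""
    level = assess_complexity(task)  # fine-tuned small model's prediction
    for route in ROUTES:
        if level <= route.max_level:
            return route.name
    return ROUTES[-1].name  # fall back to the most capable model
```

Routing most tasks to cheaper tiers is what drives the reported resource savings, since only tasks judged complex reach the largest model.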