I recently started an AI-focused educational newsletter that already has over 160,000 subscribers. TheSequence is a no-BS (meaning no hype, no news, etc.) ML-oriented newsletter that takes 5 minutes to read. The goal is to keep you up to date with machine learning projects, research papers, and concepts. Please give it a try by subscribing below:

Recent advancements in large language models (LLMs) have revolutionized the field, equipping them with new capabilities like natural dialogue, mathematical reasoning, and program synthesis. However, LLMs still face inherent limitations. Their ability to store information is constrained by fixed weights, and their computation capabilities are limited to a static graph and narrow context. Additionally, as the world evolves, LLMs need retraining to update their knowledge and reasoning abilities.

To overcome these limitations, researchers have started empowering LLMs with tools. By granting access to extensive and dynamic knowledge bases and enabling complex computational tasks, LLMs can leverage search technologies, databases, and computational tools. Leading LLM providers have begun integrating plugins that allow LLMs to invoke external tools through APIs.

This transition from a limited set of hand-coded tools to accessing a vast array of cloud APIs has the potential to transform LLMs into the primary interface for computing infrastructure and the web. Tasks such as booking vacations or hosting conferences could be as simple as conversing with an LLM that has access to flight, car rental, hotel, catering, and entertainment web APIs.

Recently, researchers from UC Berkeley and Microsoft unveiled Gorilla, a LLaMA-7B model designed specifically for API calls. Gorilla relies on self-instruct fine-tuning and retrieval techniques to enable LLMs to select accurately from a large and evolving set of tools expressed through their APIs and documentation.
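In practice, interacting with such a model looks like ordinary text generation: the user states a task in natural language and the model responds with a concrete API call. Here is a minimal sketch using the Hugging Face `transformers` library; the checkpoint id and prompt are illustrative assumptions, not the official release artifacts.

```python
# A minimal sketch of prompting a Gorilla-style model for an API call.
# The checkpoint id below is an assumption, not the official release artifact.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "gorilla-llm/gorilla-7b-hf-v1"  # hypothetical checkpoint id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

prompt = "I would like to identify the objects in an image."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)

# The fine-tuned model is expected to answer with a concrete API call,
# e.g. a torch.hub.load(...) snippet drawn from the hubs it was tuned on.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```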
The authors construct a large corpus of APIs, called APIBench, by scraping machine learning APIs from major model hubs such as TorchHub, TensorHub, and HuggingFace. Using self-instruct, they generate pairs of instructions and corresponding APIs. The fine-tuning process involves converting the data to a user-agent chat-style conversation format and performing standard instruction fine-tuning on the base LLaMA-7B model.
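As a rough illustration of that conversion step, the sketch below wraps one self-instruct (instruction, API) pair into a single-turn user-agent conversation; the field names are assumptions for illustration, not the exact APIBench schema.

```python
# A sketch of converting one self-instruct (instruction, API) pair into a
# user-agent chat-style example for instruction fine-tuning. Field names
# are illustrative assumptions, not the exact APIBench schema.
def to_chat_example(instruction: str, api_call: str) -> dict:
    """Wrap an instruction/API pair as a single-turn conversation."""
    return {
        "conversations": [
            {"from": "user", "value": instruction},
            {"from": "agent", "value": api_call},
        ]
    }

example = to_chat_example(
    instruction="Classify this image with a lightweight pretrained model.",
    api_call="torch.hub.load('pytorch/vision', 'mobilenet_v2', pretrained=True)",
)
```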
API calls often come with constraints, adding complexity to the LLM's comprehension and categorization of the calls. These challenges highlight the need for LLMs to understand not only the functional description of an API call but also to reason about the embedded constraints. For example, a prompt may require invoking an image classification model with specific parameter-size and accuracy constraints.
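To make the constraint-reasoning challenge concrete, the toy sketch below filters candidate model APIs by the kind of metadata such a prompt implies; the candidate list and its fields are invented for illustration.

```python
# A toy illustration of constraint-aware API selection: given candidate
# model APIs with metadata, keep only those that satisfy the constraints
# a prompt implies. The candidates and their fields are invented.
candidates = [
    {"api": "torch.hub.load('pytorch/vision', 'resnet152')", "params_m": 60.2, "top1_acc": 78.3},
    {"api": "torch.hub.load('pytorch/vision', 'mobilenet_v2')", "params_m": 3.5, "top1_acc": 71.9},
]

# Prompt: "an image classifier with under 10M parameters and >70% top-1 accuracy"
viable = [c for c in candidates if c["params_m"] < 10 and c["top1_acc"] > 70]
print(viable[0]["api"])  # the mobilenet_v2 call satisfies both constraints
```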
The dataset encompasses three distinct domains: Torch Hub, Tensor Hub, and HuggingFace. Torch Hub offers 95 APIs, Tensor Hub a far larger collection of 696, and HuggingFace leads the pack with 925, making it the most comprehensive domain.

To increase the value and usability of the dataset, each API is accompanied by a set of 10 uniquely tailored instructions. These instructions serve as guides for both training and evaluation, ensuring that every API goes beyond mere representation and can be utilized and analyzed robustly.
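Put together, one record in such a dataset might look like the following sketch; the field names and example instructions are assumptions, not the exact APIBench layout.

```python
# A sketch of a single record in such a dataset: one scraped API with its
# documentation plus 10 paired natural-language instructions. Field names
# and instructions are assumptions, not the exact APIBench layout.
api_record = {
    "domain": "torchhub",
    "api_call": "torch.hub.load('pytorch/vision', 'deeplabv3_resnet101', pretrained=True)",
    "api_doc": "DeepLabV3 with a ResNet-101 backbone for semantic segmentation.",
    "instructions": [
        "Segment the objects in this street scene.",
        "I need pixel-level labels for an aerial photo.",
        # ...8 more, for the 10 instructions that accompany each API
    ],
}
```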
Gorilla introduces the notion of retriever-aware training, where the instruction-tuned dataset includes an additional field with retrieved API documentation for reference. This approach aims to teach the LLM to parse and answer questions based on the provided documentation.
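A minimal sketch of how such a retriever-augmented training example could be assembled is shown below; the template wording is an assumption rather than Gorilla's exact prompt format.

```python
# A minimal sketch of assembling a retriever-aware training prompt: the
# user instruction is augmented with a retrieved-documentation field so
# the model learns to ground its answer in the provided reference. The
# template wording is an assumption, not Gorilla's exact prompt.
def build_retriever_aware_prompt(instruction: str, retrieved_doc: str) -> str:
    return (
        f"{instruction}\n"
        f"Use this API documentation for reference: {retrieved_doc}"
    )

prompt = build_retriever_aware_prompt(
    "Translate this sentence from English to German.",
    "Helsinki-NLP/opus-mt-en-de: MarianMT model for English-to-German translation.",
)
```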