Published: 2024-02-29
Nvidia, Hugging Face and ServiceNow are raising the bar for AI code generation with StarCoder2, a new family of open-access large language models (LLMs).
Available in three sizes, the models have been trained on more than 600 programming languages, including low-resource languages, to help enterprises accelerate a range of code-related tasks in their development workflows.
They were developed under the open BigCode Project, a joint effort of ServiceNow and Hugging Face aimed at ensuring the responsible development and use of large language models for code.
They are being made available royalty-free under Open Responsible AI Licenses (OpenRAIL).
"StarCoder2 stands as a testament to the combined power of open scientific collaboration and responsible AI practices with an ethical data supply chain. The state-of-the-art open-access model improves on prior generative AI performance to increase developer productivity and provides developers equal access to the benefits of code generation AI, which in turn enables organizations of any size to more easily meet their full business potential," Harm de Vries, lead of ServiceNow's StarCoder2 development team and co-lead of BigCode, said in a statement.
StarCoder2: three models for three different needs
While BigCode's original StarCoder LLM debuted in a single 15B-parameter size and was trained on roughly 80 programming languages, the latest generation goes beyond it with models in three sizes (3B, 7B and 15B) trained on 619 programming languages.
According to BigCode, the training data for the new models, known as The Stack, is more than seven times larger than the dataset used last time.
More importantly, the BigCode community used new training techniques for the latest generation to ensure the models can understand and generate low-resource programming languages such as COBOL, as well as mathematics and program source code discussions.
The smallest 3-billion-parameter model was trained using ServiceNow's Fast LLM framework, while the 7B model was developed with Hugging Face's nanotron framework.
Both aim to deliver high-performance text-to-code and text-to-workflow generation while requiring less computing.
Meanwhile, the largest 15 billion-parameter model has been trained and optimized with the end‐to‐end Nvidia NeMo cloud‐native framework and Nvidia TensorRT‐LLM software.
While it remains to be seen how well these models perform in different coding scenarios, the companies did note that the performance of the smallest 3B model alone matched that of the original 15B StarCoder LLM.
Depending on their needs, enterprise teams can use any of these models and fine-tune them further on their organizational data for different use cases. This can be anything from specialized tasks such as application source code generation, workflow generation and text summarization to code completion, advanced code summarization and code snippets retrieval.
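Before fine-tuning, a team first has to pick which of the three sizes fits its hardware. As a minimal sketch of that decision, the helper below gates the choice on available GPU memory; the thresholds are rough illustrative assumptions (roughly 2 bytes per parameter at fp16, plus overhead), not vendor guidance, and only the repository names come from the announcement.

```python
def pick_starcoder2_size(vram_gb: float) -> str:
    """Return the largest StarCoder2 variant that plausibly fits in
    `vram_gb` of GPU memory at fp16.

    Rule of thumb (an assumption, not vendor guidance): ~2 bytes per
    parameter, plus working memory for activations and the KV cache.
    """
    if vram_gb >= 40:
        # 15B params * 2 bytes ~= 30 GB of weights alone.
        return "bigcode/starcoder2-15b"
    if vram_gb >= 20:
        # 7B params * 2 bytes ~= 14 GB of weights.
        return "bigcode/starcoder2-7b"
    # The 3B model fits comfortably on a single consumer GPU.
    return "bigcode/starcoder2-3b"
```

For example, a single 24 GB workstation card would land on the 7B model under these assumptions, while a 3B deployment leaves headroom for fine-tuning state.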
The companies emphasized that the models, with their broader and deeper training, provide repository context, enabling accurate and context‐aware predictions. Ultimately, all this paves the way to accelerate development while saving engineers and developers time to focus on more critical tasks.
“Since every software ecosystem has a proprietary programming language, code LLMs can drive breakthroughs in efficiency and innovation in every industry,” Jonathan Cohen, vice president of applied research at Nvidia, said in the press statement.
“Nvidia’s collaboration with ServiceNow and Hugging Face introduces secure, responsibly developed models, and supports broader access to accountable generative AI that we hope will benefit the global community,” he added.
As mentioned earlier, all models in the StarCoder2 family are being made available under the Open RAIL-M license with royalty-free access and use. The supporting code is available on the BigCode project’s GitHub repository. As an alternative, teams can also download and use all three models from Hugging Face.
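Pulling one of the checkpoints from Hugging Face can be sketched with the `transformers` library as below. The repository IDs follow BigCode's published naming; treat them as assumptions if the hub layout changes, and note that actually calling `load()` downloads several gigabytes of weights.

```python
# Hedged sketch: loading a StarCoder2 checkpoint from the Hugging Face Hub.
STARCODER2_REPOS = {
    "3b": "bigcode/starcoder2-3b",
    "7b": "bigcode/starcoder2-7b",
    "15b": "bigcode/starcoder2-15b",
}

def load(size: str = "3b"):
    """Download (on first use) and load a StarCoder2 tokenizer and model."""
    # Imported lazily so the sketch can be read without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo = STARCODER2_REPOS[size]
    tokenizer = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")
    return tokenizer, model
```

Usage would then look like `tokenizer, model = load("3b")` followed by a standard `model.generate(...)` call on a tokenized code prompt.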
That said, the 15B model trained by Nvidia is also coming on Nvidia AI Foundation, enabling developers to experiment with them directly from their browser or via an API endpoint.
While StarCoder is not the first entry in the space of AI-driven code generation, the wide variety of options the latest generation of the project brings certainly allows enterprises to take advantage of LLMs in application development while also saving on computing.
Other notable players in this space are OpenAI and Amazon. The former offers Codex, which powers the GitHub Copilot service, while the latter has its CodeWhisperer tool. There's also strong competition from Replit, which has a few small AI coding models on Hugging Face, and Codeium, which recently nabbed $65 million in series B funding at a valuation of $500 million.