Xmodel-1.5: An 1B-scale Multilingual LLM

Wang Qun, Liu Yang, Lin Qingquan, Jiang Ling

2024-11-18

Summary

This paper introduces Xmodel-1.5, a 1-billion-parameter multilingual large language model pretrained on roughly 2 trillion tokens, along with a new Thai evaluation dataset for testing how well models handle that language.

What's the problem?

Large language models tend to perform unevenly across languages: they are usually strongest in high-resource languages like English and Chinese, while languages such as Thai and Arabic are less well served. On top of that, evaluation resources for some of these languages are scarce, which makes it hard to build, and fairly measure, compact multilingual models.

What's the solution?

The authors pretrain Xmodel-1.5, a 1-billion-parameter model, on approximately 2 trillion tokens of multilingual data. Despite its relatively small size, the model shows strong performance across several languages, with particularly notable results in Thai, Arabic, and French, alongside solid capability in Chinese and English. To help evaluate models in Thai, the authors also release a Thai evaluation dataset containing hundreds of questions annotated by students from Chulalongkorn University's School of Integrated Innovation. The models and code are publicly available on GitHub.

Why it matters?

This research matters because it shows that a compact 1B-parameter model can deliver useful multilingual capability, including in languages that larger models often underserve. By releasing both the model and a new Thai evaluation dataset, the work gives the community practical tools for advancing multilingual AI research and promoting better cross-linguistic understanding in natural language processing tasks.

Abstract

We introduce Xmodel-1.5, a novel 1-billion-parameter multilingual large model pretrained on approximately 2 trillion tokens. The model demonstrates strong performance across several languages, with particularly notable results in Thai, Arabic, and French, alongside its effectiveness in Chinese and English. In addition, we contribute to the research community by releasing a Thai evaluation dataset, which includes hundreds of questions annotated by students from Chulalongkorn University's School of Integrated Innovation. While the results are promising, we acknowledge that there is still room for improvement. We hope this work advances ongoing efforts in multilingual AI research and promotes better cross-linguistic understanding in various natural language processing tasks. Our models and code are publicly available on GitHub at https://github.com/XiaoduoAILab/XmodelLM.