This dataset contains metadata for 4,336 papers posted to arxiv.org between 12/23/2021 and 1/7/2025 that have been classified by three LLMs to determine the subject matter of each paper. The CSV includes the following columns:
- Title
- Abstract
- Authors
- Date of publication
- arXiv category tags
- Paper URL
The dataset also contains the category assigned to each paper by Gemini 1.5 Pro, Llama 3.1 405b, and Qwen 2.5 72b, as well as the "winning" category, which is the one that appears in at least two of the three classifier columns. The possible categories offered to the models were:
- Mathematical abilities
- Reasoning abilities (non-domain specific)
- Coding
- Model interpretability
- Personality and emotions
- Scientific/medical knowledge
- General knowledge abilities
- Domain-specific knowledge and use cases
- Safety, ethics, bias, and behavioral alignment
- Factuality and hallucinations
- Adaptability and generalization
- Recommendation systems
- Language, semantics, multilingual capabilities and translation
- Art, creativity, and aesthetics
- Model architecture, performance, hardware and efficiency (non-domain specific)
- Multi-modal capabilities (e.g., vision and text combined) (non-domain specific)
- Autonomous agents (non-domain specific)
- Contextualization, knowledge graphs and retrieval augmented generation (non-domain specific)
- Other
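
The "winning" category is effectively a majority vote over the three classifier outputs. The following is a minimal sketch of how such a vote could be computed; the function name `winning_category` and the choice to return `None` when all three models disagree are assumptions for illustration, not documented behavior of this dataset.

```python
from collections import Counter
from typing import Optional, Sequence


def winning_category(labels: Sequence[str], min_votes: int = 2) -> Optional[str]:
    """Return the label chosen by at least `min_votes` classifiers, else None.

    Hypothetical helper: the dataset does not document how ties or
    three-way disagreements were actually resolved.
    """
    if not labels:
        return None
    # most_common(1) gives the single most frequent label and its vote count.
    label, votes = Counter(labels).most_common(1)[0]
    return label if votes >= min_votes else None


# Two of three classifiers agree, so "Coding" wins.
print(winning_category(["Coding", "Coding", "Mathematical abilities"]))
```

With three classifiers and a threshold of two votes, at most one label can qualify, so the result is unambiguous whenever any two models agree.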
Papers that were classified as "Domain-specific knowledge and use cases" were further classified by the same models into one of the following domains:
- Science
- Computing and cybersecurity
- Law and politics
- Finance and economics
- Engineering
- Manufacturing
- Medicine
- Education
- Psychology
- Shopping and consumption
- Art and aesthetics
- Transportation
- Other
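
As a rough sketch of how the second-stage labels might be used, the snippet below counts domains among the papers whose category is "Domain-specific knowledge and use cases". The column names `winning_category` and `domain` are hypothetical (check the actual CSV header before using them), and the sample rows are made up purely for illustration.

```python
import csv
import io
from collections import Counter

DOMAIN_SPECIFIC = "Domain-specific knowledge and use cases"


def domain_counts(csv_text: str,
                  category_col: str = "winning_category",  # assumed column name
                  domain_col: str = "domain") -> Counter:  # assumed column name
    """Count domain labels over rows classified as domain-specific."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(
        row[domain_col]
        for row in reader
        if row[category_col] == DOMAIN_SPECIFIC
    )


# Made-up rows for illustration only; they are not taken from the dataset.
sample = (
    "title,winning_category,domain\n"
    "Paper A,Domain-specific knowledge and use cases,Medicine\n"
    "Paper B,Coding,\n"
    "Paper C,Domain-specific knowledge and use cases,Medicine\n"
    "Paper D,Domain-specific knowledge and use cases,Education\n"
)

counts = domain_counts(sample)
print(counts.most_common())
```

To run this against the real file, read it with `open(path, newline="", encoding="utf-8")` and pass the text in, adjusting the two column names to match the header.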