AI research atlas / v2

Learn AI papers in the right order.

Start with landmark ideas, move through foundations, then branch into LLMs, GenAI, agents, systems, and safety with a reading path that keeps the field from feeling random.

10 learning tracks / Full-paper reader / ChatGPT handoff

Recommended first: Landmark papers

Build the mental timeline before going deep.

Then specialize: LLMs, GenAI, safety

Move from foundations to modern systems.

Read mode: PDF + resources

Path-first: No more random paper hopping

Research-native: arXiv links, PDFs, resources

Study loop: Track reading and discuss in ChatGPT

Learning path

Where to start, and what to read next

Start with landmarks

Orientation / 1-2 weeks

Start Here

Read the papers everyone keeps referencing so the rest of the map has anchors.

Know the landmark names / Build historical context / Pick a direction

Foundations / 2-4 weeks

Classical ML

Learn the statistical and probabilistic ideas that still sit under modern models.

Bayesian thinking / Model evaluation / Uncertainty

Foundations / 1-2 weeks

Optimization

Understand the training mechanics behind gradient-based learning.

Gradient descent / Generalization / Training stability

Builder / 3-5 weeks

Deep Learning Core

Move through representation learning, CNNs, residual networks, and scaling patterns.

CNN intuition / Representation learning / Benchmark culture

Builder / 3-6 weeks

Sequence Models and LLMs

Study attention, transformers, language modeling, instruction tuning, and evaluation.

Attention / Pretraining / Instruction following

Specialist / 3-6 weeks

Generative AI

Compare GANs, diffusion, autoregressive generation, and modern GenAI workflows.

Diffusion / GANs / Generation tradeoffs

Specialist / 2-4 weeks

Multimodal and Retrieval

Connect language with images, retrieval, embeddings, and real-world knowledge access.

Vision-language / Embeddings / Retrieval

Specialist / 3-5 weeks

RL and Agents

Learn decision making, feedback, policy learning, and agent-style systems.

Policies / Rewards / Exploration

Practitioner / 2-4 weeks

Systems and Scaling

Understand the infrastructure and engineering papers behind large-scale training.

Distributed training / Serving / Efficiency

Practitioner / 2-4 weeks

Safety and Interpretability

Study robustness, alignment, transparency, and how to reason about model behavior.

Alignment / Robustness / Interpretability

Research library

Multimodal Learning

Showing papers for this learning path. Open any paper card to read the full paper and related resources.

40 papers shown

2008

The qualitative content analysis process

AIM: This paper is a description of inductive and deductive content analysis.

BACKGROUND: Content analysis is a method that may be used with either qualitative or quantitative data and in an inductive or deductive way. Qualitative content analysis is commonly used in nursing studies but little has been published on the analysis process and many research books generally only provide a short description of this method.

DISCUSSION: When using content analysis, the aim was to build a model to describe the phenomenon in a conceptual form. Both inductive and deductive analysis processes are represented as three main phases: preparation, organizing and reporting. The preparation phase is similar in both approaches. The concepts are derived from the data in inductive content analysis. Deductive content analysis is used when the structure of analysis is operationalized on the basis of previous knowledge.

CONCLUSION: Inductive content analysis is used in cases where there are no previous studies dealing with the phenomenon or when it is fragmented. A deductive approach is useful if the general aim was to test a previous theory in a different situation or to compare categories at different time periods.

Architecture

Learning Paradigms

Applications

Trust and Deployment

Multimodal Learning

The qualitative content analysis process

A survey on Image Data Augmentation for Deep Learning

Learning Transferable Visual Models From Natural Language Supervision

Zero-Shot Text-to-Image Generation

Wavelet Attention is all you need in multimodal medical image fusion

Transferable-Guided Attention Is All You Need for Video Domain Adaptation

Re-Attention Is All You Need: Memory-Efficient Scene Text Detection via Re-Attention on Uncertain Regions

RITA: Group Attention is All You Need for Timeseries Analytics

Multimodal Attention Is All You Need

Hierarchical Pre-Training of Vision Encoders with Large Language Models

LLandMark: A Multi-Agent Framework for Landmark-Aware Multimodal Interactive Video Retrieval

Chitrakshara: A Large Multilingual Multimodal Dataset for Indian languages

Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models

Research on the Intelligent Perception System of Robots Integrated with Multimodal AI Technology

Why Text Prevails: Vision May Undermine Multimodal Medical Decision Making

BLIP-FusePPO: A Vision-Language Deep Reinforcement Learning Framework for Lane Keeping in Autonomous Vehicles

Evaluating Open-Source Vision-Language Models for Multimodal Sarcasm Detection

Model Merging to Maintain Language-Only Performance in Developmentally Plausible Multimodal Models

Object Detection with Multimodal Large Vision-Language Models: An In-depth Review

Vision-Language-Vision Auto-Encoder: Scalable Knowledge Distillation from Diffusion Models

Vision-Language Model for Object Detection and Segmentation: A Review and Evaluation

How do language models learn facts? Dynamics, curricula and hallucinations

MGPATH: Vision-Language Model with Multi-Granular Prompt Learning for Few-Shot WSI Classification

Pre-trained Vision-Language Models Learn Discoverable Visual Concepts

Semi-Supervised Multimodal Multi-Instance Learning for Aortic Stenosis Diagnosis

Multimodal Adversarial Defense for Vision-Language Models by Leveraging One-To-Many Relationships

CMAL: A Novel Cross-Modal Associative Learning Framework for Vision-Language Pre-Training

Improving Multimodal Large Language Models Using Continual Learning

The Multimodal Universe: Enabling Large-Scale Machine Learning with 100TB of Astronomical Scientific Data

Meta Learning to Bridge Vision and Language Models for Multimodal Few-Shot Learning

Beneath the Surface: Unveiling Harmful Memes with Multimodal Reasoning Distilled from Large Language Models

Large Language Models and Multimodal Retrieval for Visual Word Sense Disambiguation

Explaining Vision and Language through Graphs of Events in Space and Time

Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP

Can Linguistic Knowledge Improve Multimodal Alignment in Vision-Language Pretraining?

A Survey on Multimodal Large Language Models

OphGLM: Training an Ophthalmology Large Language-and-Vision Assistant based on Instructions and Dialogue

GaLeNet: Multimodal Learning for Disaster Prediction, Management and Relief

Information Retrieval from the Digitized Books

Vision-Language Pre-training: Basics, Recent Advances, and Future Trends