5th International Workshop on Databases and Machine Learning

in conjunction with ICDE 2026 — May 2026

About

With the increased adoption of machine learning (ML) across applications and disciplines, a strong synergy between the database (DB) systems and ML communities has emerged. Steps involved in ML pipelines—such as data preparation and cleaning, feature engineering, and management of the ML lifecycle—can benefit significantly from advances in data management. For example, managing the ML lifecycle requires mechanisms for modeling, storing, and querying ML artifacts in a robust, scalable, and auditable manner.

More recently, the advent of large language models (LLMs) and Retrieval-Augmented Generation (RAG) has further intensified the need for high-performance data management infrastructures. Modern AI systems increasingly rely on vector databases, efficient vector search, and scalable model serving. At the same time, the rise of multimodal AI introduces demanding requirements for storing and querying images, audio, video, and other complex data types, all while maintaining low latency and high throughput for end users.

In the opposite direction, ML techniques are now explored in core components of database systems, including query optimization, indexing, storage layout, and self-tuning. Long-standing challenges in databases—such as cardinality estimation, operator and plan selection, resource management, and other tasks traditionally handled with extensive human expertise or rigid heuristics—increasingly benefit from learned models and data-driven approaches.

DBML 2026 aims to bring together researchers and practitioners working at this intersection, providing a dedicated forum for DB-inspired and ML-inspired approaches that address challenges in either or both communities. We welcome work that combines the strengths of DB and ML, ranging from foundational techniques and system designs to practical applications and real-world deployments, including ML for scientific data and other data-intensive domains.

Information about previous editions can be found at DBML 2025, DBML 2024, DBML 2023, and DBML 2022.

For questions regarding the workshop, please contact: dbml26@googlegroups.com.

Topics of Interest

ML for Data Management and DBMS

  • Learned data discovery, cleaning, and transformation
  • ML-enabled data exploration and discovery in data lakes and lakehouses
  • Learned database design, configuration, and tuning
  • ML for query optimization, indexing, and storage/layout decisions
  • Natural language interfaces for data (querying, exploration, summarization, assistants)
  • Pretrained, foundation, and LLM-based models for data management
  • Representation learning for data cleaning, preprocessing, and integration
  • Benchmarking and evaluation of ML-enhanced data management and DBMS components

Data Management for ML and AI Systems

  • Data collection, preparation, and governance for ML/LLM/RAG applications
  • Data quality, robustness, provenance, and lineage for ML workflows
  • Systems and storage for efficient training, inference, and model serving
  • Vector databases, indexing, and hybrid query processing for embeddings
  • Management of multimodal data (text, images, audio, video, etc.) for AI applications
  • DB-inspired techniques for modeling, storage, and provenance of ML and AI artifacts

Keynote Speakers

Details on the keynote speakers for DBML 2026 will be announced soon.

Program

The final program and list of accepted papers will be announced soon.

Important Dates

All deadlines are 11:59 PM AoE.

Submission deadline: January 22, 2026
Author notification: February 19, 2026
Camera-ready version: March 5, 2026
Workshop day: May 4, 2026

Submission and Author Guidelines

Submissions should be made electronically via the submission site. Papers must be prepared in accordance with the official IEEE conference templates. Submitted papers must not exceed 6 pages including references. No appendix is allowed. Only electronic submissions in PDF format will be accepted. Submissions will be reviewed in a single-blind manner.

Organisation

  • Fatemeh Nargesian - University of Rochester (Workshop Chair)
  • Guillaume Lachaud - École polytechnique (Workshop Chair)
  • Jiwon Chang - University of Rochester (Publicity Chair)

Program Committee

The program committee membership listed below is tentative.

  • Gerardo Vitagliano - MIT CSAIL
  • Roee Shraga - WPI