Recommender systems research is highly empirical, but many published claims remain difficult to interpret, compare, or generalize because evaluation practices vary substantially across papers. Choices concerning task definition, datasets, data partitioning, baselines, metrics, and statistical testing can all influence the conclusions of a study, yet the community still lacks broadly shared standards for experimental rigor and methodological reporting. This workshop aims to create a venue for discussing and advancing stronger research assessment practices in recommender systems. In particular, we seek both to encourage more rigorous evaluation of algorithmic papers and to recognize experimental methodology itself as a contribution to the field. The workshop features two tracks. The Research Papers track welcomes papers proposing new algorithms, models, or methods, and uses a methodology-first, results-blind review process. The Experimental Methodologies track welcomes submissions proposing, discussing, or critiquing evaluation methodologies, protocol designs, and methodological standards, with the goal of supporting discussion during the workshop and informing longer-term community efforts.
Important Dates
- Paper submission deadline: July 20, 2026
- Author notification: August 14, 2026
Track 1: Research Papers
This track is intended for regular research papers, with a particular emphasis on experimental rigor. We welcome submissions that introduce new recommendation methods, industry-relevant variants, or new formulations of recommendation problems, as well as papers presenting modifications to existing algorithms, provided that the contribution is technically sound, well-motivated, and evaluated through an appropriate and well-documented experimental methodology. Papers should either present genuine novelty, even if they do not outperform the current state of the art, or, if the contribution is more incremental, show an improvement over existing methods. In all cases, the strength of the evaluation design is essential. Because the workshop focuses on experimental rigor, submissions to the Research Papers track should:
- clearly specify the target task and the underlying problem assumptions;
- motivate the choice of datasets and any preprocessing decisions, the data partitioning strategy, the measures taken to avoid leakage, and the choice of baselines, explaining why they are appropriate;
- provide a detailed description of the hyperparameter optimization process and of the evaluation metrics adopted.
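As an illustration only, and not a requirement of the workshop, the kind of partitioning and leakage documentation asked for above could be accompanied by a small script such as the following Python sketch. It assumes a global temporal split at a single cutoff timestamp, which is one possible protocol among many. The names `Interaction`, `global_temporal_split`, and `leakage_report` are hypothetical, and the toy interaction log is made up for the example.

```python
from typing import NamedTuple


class Interaction(NamedTuple):
    user: str
    item: str
    timestamp: int  # any monotonically increasing time unit


def global_temporal_split(interactions, cutoff):
    """Split at one global timestamp: train strictly before, test at or after."""
    train = [x for x in interactions if x.timestamp < cutoff]
    test = [x for x in interactions if x.timestamp >= cutoff]
    return train, test


def leakage_report(train, test):
    """Report two simple leakage symptoms between a train and a test partition.

    temporal_overlap: some training interaction is not older than every test one.
    shared_pairs: (user, item) pairs present in both partitions. These may be
    legitimate in repeat-consumption settings, so they are reported, not rejected.
    """
    temporal_overlap = bool(train) and bool(test) and (
        max(x.timestamp for x in train) >= min(x.timestamp for x in test)
    )
    shared_pairs = {(x.user, x.item) for x in train} & {(x.user, x.item) for x in test}
    return {"temporal_overlap": temporal_overlap, "shared_pairs": shared_pairs}


# Toy interaction log (made-up data, for illustration only).
log = [
    Interaction("u1", "i1", 10),
    Interaction("u1", "i2", 20),
    Interaction("u2", "i1", 15),
    Interaction("u2", "i3", 30),
    Interaction("u3", "i2", 40),
]

train, test = global_temporal_split(log, cutoff=25)
report = leakage_report(train, test)
```

Shipping a check of this kind together with the partitioning code lets reviewers confirm that the released splits match the protocol described in the paper.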
Review Process for Research Papers
Research papers will be peer-reviewed by members of the program committee in two stages, using a methodology-first, results-agnostic reviewing process. In the first stage, reviewers will evaluate the paper’s technical and theoretical soundness and its evaluation methodology, without access to the study results. Reviewers will assess whether the proposed method constitutes a well-motivated contribution, whether its design choices are adequately justified, and whether the experimental methodology is described in detail and is appropriate to support the claims of the paper. Reviewers will also assess the completeness and quality of artifacts such as source code and datasets. Submissions that lack a rigorous evaluation framework, or whose artifacts are not aligned with the stated evaluation methodology, will be rejected during this stage.
In the second stage, the complete manuscripts of the submissions that passed the first stage will be reviewed. Reviewers will consider the reported results, their interpretation, and the reproducibility of the work, primarily as a consistency and completeness check. Papers that pass the first-stage methodological review are expected to be accepted in most cases, provided that the full manuscript is consistent with the reviewed methodology and does not reveal major issues. The final decision will therefore not depend on whether the results are positive or outperform a baseline. Negative results, failed hypotheses, or outcomes that do not improve over existing methods will not constitute grounds for rejection, as long as the problem addressed is important, the proposed method is technically sound and well-motivated, and the evaluation methodology is rigorous and informative. Conversely, papers reporting positive results obtained with a methodology that is not deemed sufficiently reliable will be rejected in the first stage, without the reviewers ever having access to the results.
Authors of Research Papers are asked to submit two versions of their manuscript:
- Methodology version (restricted): This version must not report anything related to the results of the study. At this stage, manuscripts will be evaluated based on the importance of the problem addressed, the soundness of the proposed method, and the quality of the methodology. Manuscripts may include an introduction, related work, a description of the proposed methodology, the task definition, the datasets used, and the planned evaluation protocol, with all the relevant details. However, there should be no section reporting results or discussing outcomes. Authors should also remove any mention of results from the included sections, such as the abstract and introduction. Authors may use a note like "[result-blind]" to highlight that parts of the text have been removed to ensure this version is result-blind. Authors are also strongly encouraged to provide source code (including code for the baseline and code to partition datasets), datasets, configuration details, and any other reproducibility material.
- Experimental version (complete): The complete manuscript containing all sections of the paper, including results and their discussion.
Submission Format
Research Papers should be submitted in PDF format via T.B.D. The proceedings will be published on CEUR-WS, and submissions should follow the CEUR-WS single-column format. The length should be commensurate with the contribution; the page limit for the complete version is 8 pages plus references.
Presentation
At least one author of each accepted Research Paper must attend the workshop and present the paper in person. Presentation is mandatory for acceptance in this track.
Track 2: Experimental Methodologies
The Experimental Methodologies track is intended for submissions that propose, discuss, critique, or systematize evaluation methodologies for algorithmic research papers on recommender systems. In addition to full papers, we also welcome shorter position statements that raise relevant methodological questions, identify gaps, or outline promising directions for discussion at the workshop. The goal of this track is to foster contributions that can help the community move toward more rigorous, transparent, and comparable empirical research practices. In particular, we welcome submissions that aim to identify and motivate best practices that researchers can follow when designing and conducting an experimental study in recommender systems. We are especially interested in contributions that can serve as a useful methodological reference for future work, helping authors design experiments that are technically sound, clearly justified, and easier to compare across papers. We also welcome contributions that highlight missing resources (datasets, frameworks, etc.) for certain underexplored tasks. Given the exploratory and community-building nature of this debate, submissions to this track may combine descriptive, critical, and normative elements. For example, a submission may define a methodological problem, analyse current practice in the literature, discuss its limitations, and propose one or more rigorous alternatives. Because this discussion is still at an initial stage, we also welcome survey-style contributions and papers that organize, compare, and distil existing methodological practices, and suggest directions for methodological standardization. 
We are interested both in widely studied recommendation tasks, such as top-N recommendation, sequential recommendation, and session-based recommendation, and in more challenging settings that are less well explored by the research community and are harder to evaluate consistently, such as tasks involving interest drift or other dynamic phenomena. Overall, the goal is not to discuss point-wise methodological aspects, but rather to encourage broader contributions that can provide a complete description of best practices on how to conduct a comparative experiment for a specific task.
Examples of Relevant Contributions
As examples, we are particularly interested in submissions that discuss the following:
- critical analysis of current practice, supported by a state-of-the-art study, including discussion of known methodological issues already raised in prior work;
- proposal for an evaluation methodology tailored to a specific and relevant task, or a comparison and discussion of multiple possible methodological choices;
- discussion of which datasets are suitable or unsuitable for a target problem, and which baselines should be considered the minimum reasonable set for comparison;
- suggestions on how to improve comparability across papers adopting a given methodology or protocol;
- methodological recommendations for hyperparameter optimization and model selection;
- what should count as a fair comparison, for example in terms of computational budget, hyperparameter optimization effort, hardware resources, or number of optimization trials;
- missing resources needed, such as frameworks, datasets, simulation tools, etc.
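To make the fair-comparison item in the list above concrete, the following Python sketch shows one possible way to give the proposed method and every baseline the same hyperparameter optimization budget and to record that budget for reporting. It is a minimal sketch under stated assumptions: random search over a discrete space is only one of many optimization strategies, the helper names `random_search` and `tune_all` are hypothetical, and the two `toy_*` functions are made-up stand-ins for training a model and measuring a validation metric.

```python
import random


def random_search(evaluate, space, n_trials, seed):
    """Sample n_trials configurations uniformly from a discrete space; keep the best."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = {name: rng.choice(values) for name, values in space.items()}
        score = evaluate(config)  # validation metric, higher is better
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def tune_all(models, n_trials, seed=0):
    """Tune every model with the same trial budget and log the budget used."""
    report = {}
    for name, (evaluate, space) in models.items():
        config, score = random_search(evaluate, space, n_trials, seed)
        report[name] = {
            "best_config": config,
            "validation_score": score,
            "trials": n_trials,
        }
    return report


# Hypothetical stand-ins for "train the model and return a validation metric".
def toy_baseline(config):
    return -abs(config["k"] - 50) / 100


def toy_proposed(config):
    return -abs(config["lr"] - 0.01) - abs(config["dim"] - 64) / 1000


models = {
    "baseline_knn": (toy_baseline, {"k": [10, 50, 100, 200]}),
    "proposed": (toy_proposed, {"lr": [0.001, 0.01, 0.1], "dim": [32, 64, 128]}),
}

budget_report = tune_all(models, n_trials=20)
```

An equal number of trials is only one notion of fairness; a submission might equally argue for equal wall-clock time or equal hardware, which would require a different budget accounting than the one sketched here.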