Frequently Asked Questions (FAQs)
Below, you will find answers to common questions regarding reproducibility checks and regarding our Data and Code Availability Policy.
Scope of the reproducibility checks
The reproducibility checks conducted by the Econometric Society aim to ensure that replication packages are comprehensive and well-documented, allowing other researchers to replicate the paper's results independently of the authors. These checks involve examining the provided documentation, data, and codes to confirm that they reproduce all the results presented in the paper, including tables, figures, and any numerical information not directly derived from the visual elements.
If the data are accessible, either included in the package or temporarily granted to the ES reproducibility team, the checks certify that the code precisely reproduces the published results, including those in the approved online appendices. However, if authors have been granted exemptions and provide simulated or synthetic data instead, it is not possible to verify the exactness of the results. In such cases, the package is assessed to ensure it produces all exhibits in the paper and approved online appendices, even if the results may not match those reported in the publication.
It is important to note that the Econometric Society focuses specifically on reproducibility checks, as opposed to replication checks. Consequently, the checks do not evaluate coding errors, discrepancies between the code's intended functionality and its actual implementation, econometric errors, or whether the paper's empirical approach can be replicated in alternative environments or datasets.
The procedure of implementing reproducibility checks provides several advantages for authors. Firstly, it fosters research transparency, which in turn enhances the credibility of scientific research. By ensuring that the results can be reproduced, authors' integrity is explicitly preserved and disciplined.
Moreover, making replication packages publicly available and fully functional allows independent researchers to replicate the results without relying on the authors. This increased accessibility not only facilitates a comprehensive understanding of the published research but also encourages further studies and the development of new research based on the published results. Hence, it increases the impact of the research.
Additionally, the reproducibility checks often help detect and rectify small errors and typos in the replication packages or the paper/appendices prior to publication. This proactive approach eliminates the need for subsequent errata and ensures the accuracy and reliability of the research findings.
Overall, the implementation of reproducibility checks benefits authors by upholding research integrity, increasing the impact of their work, and enabling error correction before publication.
Absolutely. The reproducibility checks encompass not only all the results in the main paper but also those in approved online appendices, published at the journal's website as supplementary material. The replication package provided by the authors is expected to generate all tables, figures, and in-text numbers found in the paper and its appendices, ensuring that the code accurately reproduces the results. This comprehensive assessment includes verifying the ability of the codes to produce the results presented in both the main paper and any supplementary online appendices.
At the Econometric Society, we firmly uphold reproducibility and replicability as fundamental principles of scientific research. While replication checks are undoubtedly valuable, they often demand considerable time, effort, and resources that journals typically cannot allocate due to the need for a swift publication process that facilitates the advancement of science.
By conducting reproducibility checks, we establish a crucial initial step in ensuring the integrity of published research. Our aim is to guarantee that authors make all relevant data and codes accessible, allowing others to replicate the results presented in their papers. Significantly, these checks ascertain that the provided codes and data execute successfully and yield the published outcomes.
The certification we provide through reproducibility checks bolsters transparency by enabling other researchers to reproduce the published research and examine it within different datasets, assumptions, methods, and more. Moreover, it offers an additional service to authors, as we often identify minor errors that can be rectified prior to publication, sparing the need for subsequent errata.
Data and Code Availability Policy and exemptions
In order to guarantee the availability of data for future replication attempts, even publicly available data must be included in the replication package. The exception to this rule is if the exact extract of the raw data used in the analysis is published in an open-access trusted repository that satisfies the FAIR data principles. In that case, including the permanent identifier (e.g., DOI) linking to these raw data is considered sufficient to fulfill the obligation of including the raw data in the replication package.
Publicly available datasets can undergo updates or removal by the data provider. Therefore, if you solely indicate how to access the data without including it in the replication package, there is a risk that your version of the data may become inaccessible to researchers in the future. By incorporating the datasets within the replication package, you ensure their continued availability and enable other researchers to replicate your results reliably. This practice aligns with the principles of transparency and facilitates the reproducibility of your research findings.
The right to re-publish publicly available data varies among providers, as they have different policies regarding data redistribution (here is an unofficial summary of terms of use of some standard datasets). Some providers allow re-distribution if your extract is deposited in a specific repository. It is essential to ascertain the restrictions on publishing your data before submitting your paper. Additionally, it is important to seek permission from the original data owner and cite the original source appropriately.
Yes, you can request an exemption if your data are restricted-access or you do not have the right to publish them. The request should be made at the time of initial submission. To do so, include a cover letter addressed to the co-editor handling your paper, clearly stating the grounds for the exemption. The co-editor will assess the justification before the paper is sent to referees. If the exemption is not granted, the manuscript will be rejected unless you accept the data and code availability policy. Note that submission fees will not be refunded. If an exemption is required for new data incorporated during the editorial process, it should be requested in the first iteration where the new data are introduced.
Yes, you can request an exemption for specific portions of your data at the time of initial submission.
If my main dataset is available to publish, but there is a small portion of my data that I am not allowed to share, should I request an exemption?
Yes, requesting a data exemption is necessary in such cases. Without the exemption, if you are unable to publish the data, your paper will be rejected for publication.
Yes, as long as the appendix is approved by the co-editor and published online as supplementary material of the paper, an exemption should be requested if any data required to produce the results cannot be shared. Without the exemption, if you are unable to publish the data, your paper will be rejected for publication.
If the data I use are publicly available to everyone, but I do not have permission to re-publish it, should I request a data exemption?
Yes. Without the exemption, if you are unable to publish the data, your paper will be rejected for publication.
Can I request an exemption after the initial submission?
In general, exemptions can only be requested for newly incorporated data during the editorial process. If an exemption was not requested at the time of initial submission and your data cannot be published, your paper may be rejected for publication.
If my data are free of charge and available to any researcher who requests it from the data provider, but I don't have the right to publish them with the replication package, should I request an exemption?
Yes, requesting an exemption is necessary when data used for analysis cannot be published with the replication package or in an open-access trusted repository that satisfies the FAIR data principles.
No, you do not need to request an exemption in cases where the (exact extract of the) data are archived in an open-access trusted repository that satisfies the FAIR data principles. As long as the published extract used in the study is available in the exact format required by your code, it is acceptable for the replication package. The Data Editor will evaluate the suitability of the repository.
Data archived in open-access trusted repositories that satisfy the FAIR data principles are acceptable for the replication package as long as the published extract used in the study is included and readily available in the exact format required by the code. The suitability of the repository will be evaluated by the Data Editor to determine whether publishing a copy with the package on the journal's repository is necessary.
Yes, you need to submit a replication package. Personal websites or similar platforms are not considered "trusted" open repositories because there is no guarantee of systematic archiving. Refer to this for guidance on what qualifies as a trusted repository.
No, you cannot request an exemption based on wanting to retain exclusivity rights for future research. The data and code availability policy aims to ensure transparency and reproducibility, which necessitates publishing the data you collected. Increased visibility and accessibility of your data contribute to the advancement of research.
Yes, you can request a data exemption for commercial data providers. While restricted access data is generally discouraged, if your research heavily relies on a specific dataset that does not have an open alternative, exemptions can be granted. However, you may be required to provide certification from the provider, confirming that the data will be archived and made available to other users following the same access procedure.
In general, exemptions for experimental data cannot be requested. It is important to anonymize the data to ensure the subjects cannot be identified. Only in cases where the nature of the study prevents anonymization, or restrictions imposed by ethics reviews prevent publication of the data, can authors request a data exemption. The exemption will cover only the minimum necessary to maintain the anonymity of the experimental subjects and to fulfill ethical requirements.
In general, no.
Yes. Open source software is encouraged, but licensed software is allowed.
Whenever possible, it is recommended to include these packages or libraries in the replication package. If the packages or libraries are available in open repositories (e.g., most Stata packages), providing clear instructions on how to download and use them, including version, would be sufficient. However, if the libraries cannot be included in the package and are not publicly available, the Data Editor will liaise with the authors to determine a feasible approach for implementing the necessary checks. The aim is to ensure that the replication package is comprehensive and allows for the replication of the research findings.
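As an illustration of documenting library versions, the following is a minimal sketch that assumes the replication code is written in Python (the policy itself does not prescribe a language). The function names and the package names in the example are hypothetical; only the standard-library `importlib.metadata` module is used.

```python
from importlib import metadata


def installed_versions(package_names):
    """Map each package name to its installed version, or None if it is not installed."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def format_requirements(versions):
    """Render pinned 'name==version' lines for the packages that were found."""
    return [f"{name}=={version}" for name, version in versions.items() if version]


# "pip" is used only as an example; the second name is deliberately fictitious.
found = installed_versions(["pip", "hypothetical-package-not-installed"])
lines = format_requirements(found)
```

The resulting lines could be pasted into the README next to the download instructions, so that users install the same versions that produced the published results.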
Procedures when exemptions are granted
If you have been granted an exemption, your paper will still undergo reproducibility checks before final acceptance. To facilitate this process, you have two options. Firstly, you can grant (or help the replication team obtain) temporary access (either remote or physical) to the restricted data exclusively for the purpose of the checks. Once the checks are completed, the data will be destroyed, or access will be terminated. Alternatively, you can provide simulated or synthetic dataset(s) that replicate the characteristics of the original data used in your analysis. Simulated/synthetic datasets are not required when temporary access is granted, but they are recommended, as they allow users to run and test your codes.
A simulated dataset is generated based on a model, ideally the model used in your research. On the other hand, a synthetic dataset involves scrambling or perturbing the actual dataset to ensure anonymity while preserving key characteristics.
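The distinction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the linear model, the parameter values, the function names, and the stand-in "confidential" records are all hypothetical, and the naive perturbation shown here does not by itself guarantee anonymity.

```python
import random


def simulate_dataset(n, alpha, beta, sigma, seed=0):
    """Simulated data: draw observations from an assumed model y = alpha + beta*x + e."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        e = rng.gauss(0.0, sigma)
        rows.append({"x": x, "y": alpha + beta * x + e})
    return rows


def synthesize_dataset(rows, noise_sd, seed=0):
    """Synthetic data: perturb each value of an existing dataset and shuffle the row order.

    Illustrative only; this simple perturbation is NOT a vetted anonymization method.
    """
    rng = random.Random(seed)
    perturbed = [
        {key: value + rng.gauss(0.0, noise_sd) for key, value in row.items()}
        for row in rows
    ]
    rng.shuffle(perturbed)
    return perturbed


# Stand-in for confidential data (hypothetical values).
confidential = [{"x": 1.0, "y": 2.1}, {"x": 2.0, "y": 3.9}, {"x": 3.0, "y": 6.2}]

simulated = simulate_dataset(n=100, alpha=0.0, beta=2.0, sigma=0.5, seed=42)
synthetic = synthesize_dataset(confidential, noise_sd=0.1, seed=42)
```

The simulated dataset never touches the confidential records, whereas the synthetic dataset starts from them; fixing the seed keeps either dataset reproducible.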
Whenever possible, it is strongly recommended to provide (or help the replication team obtain) temporary access to the restricted data. This approach offers several advantages: (i) it eliminates the need to produce synthetic or simulated datasets, saving effort and potential loss of fidelity; (ii) the certification provided by the journal becomes stronger as it verifies the ability to reproduce the published results rather than solely checking the completeness and functionality of the code; and (iii) it allows for the detection and correction of any errors before publication. When you provide temporary access to the restricted data, simulated/synthetic datasets are not required, but they are recommended, as they allow users to run and test your codes.
The reproducibility team at the Econometric Society treats the restricted data with the utmost ethical standards, ensuring confidentiality and using them solely for running the reproducibility checks. The restricted datasets are destroyed once the checks are performed and will not be published.
What should I do if I am not allowed to provide temporary access to the confidential data, but the data provider can run the code to implement the reproducibility checks?
In such cases, where direct access to the restricted data is not feasible, it is still preferable to facilitate the checks through the data provider, as long as they can be executed within a reasonable timeframe. You should supply the replication package to the journal along with the contact information of the data provider. The reproducibility team will then coordinate with the data provider to run the code and obtain the necessary output for verification.
The reproducibility team will treat the data with the highest ethical standards, preventing any violations of confidentiality, and using them exclusively to run the reproducibility checks. The restricted datasets will be destroyed as soon as the checks are performed and, therefore, they will not be published.
If involving a certification agency is a possibility, it is generally preferred over only providing simulated/synthetic datasets. However, it is essential to seek approval from the Data Editor before making any commitments with the certification agency. It's important to note that the Econometric Society will not cover the cost of certification in such cases.
If a public use testing sample is available, it is generally preferred over a simulated/synthetic dataset (but less preferred than providing temporary access to the original data) as long as the testing sample can be published with your replication package. If the sample cannot be published, it is advisable to provide a simulated/synthetic dataset that can be included in the package.
In the case of simulated or synthetic datasets, they will be published along with the replication package. Although these datasets may not represent the actual data, their structure is designed to closely mimic the original dataset, providing readers with a better understanding of the data used. You must ensure that the manipulation process used to create the synthetic/simulated datasets is clearly described in the README file.
When reproducibility checks cannot be performed on real data, running these checks on simulated/synthetic datasets still offers advantages. They help verify the completeness, self-contained nature, and error-free execution of the code. For future users of the package, it also allows them to run the codes and learn from them.
In such cases, it is strongly recommended to simulate data using your model as the data generating process. If that is not feasible, it is important to contact the Data Editor and provide a detailed explanation of why this is the case. The Data Editor will assist you in finding a suitable solution and may propose alternative approaches to handle the situation to your managing co-editor.
To generate a dataset that closely resembles the original data, the synthetic option may be easier to implement. There are various open-source routines available that can assist in this process. However, consider two main disadvantages: (i) one needs to ensure proper anonymization of the data when using a scrambling/perturbation algorithm, and (ii) non-linear estimation routines may face convergence challenges when applied to synthetic data, whereas artificial datasets generated by the model being estimated are more likely to converge.
Implementation of the reproducibility checks
Typically, our aim is to provide the outcome of our reproducibility checks within two weeks. However, if the package is incomplete or the code does not run, multiple iterations may be required, which can extend the processing time. Articles that involve lengthy computations may also require additional time. The responsiveness of the authors to our requests also influences the overall processing time.
Please be aware that emails sometimes go to spam. Check your spam folder regularly when you are expecting to receive the outcome from the checks.
The reproducibility checks are conducted by a dedicated team of advanced Ph.D. students supervised by our Data Editor. After an article is conditionally accepted for publication, authors are requested to submit the replication package alongside other production files. The Data Editor assigns the package to one or more members of the reproducibility team. They review the package, run the code, and compile a report summarizing the outcome of the checks. The Data Editor then reviews the report and communicates the decision to the authors, requesting any necessary amendments to the package. Once the reproducibility checks are completed, the article is transferred back to the original Editor for final acceptance. If changes to the conditionally accepted manuscript are required due to the checks, these will need to be reviewed and approved by the handling co-editor before acceptance. Full reproducibility is a condition for final acceptance.
Yes, upon submission, the Data Editor assigns the package to one or more members of the reproducibility team, who will run your code and examine the generated output. They compile a report summarizing the results of the checks. However, in cases where the code is computationally demanding and exceeds a reasonable runtime, the Data Editor may contact you with a recommendation to provide a simplified version of the code that focuses on testing the essential components. For example, this can involve reducing the number of replications in a simulation exercise or simplifying optimization routines. The simplified "testing" version will be published alongside the original code in your replication package, promoting transparency and facilitating replication or related research by other scholars. In some instances, the reproducibility checks may be outsourced to authorized third parties.
If your code requires excessive computational resources and/or time, the Data Editor will contact you with a recommendation to supply a simplified version that focuses on essential aspects of the code. This can include reducing the number of replications in a simulation exercise, simplifying structural model-solving procedures, or creating simplified functions to test optimization routines. The simplified "testing" version will be included in your replication package alongside the original code. Providing these testing versions enhances the transparency of your research, increases its visibility, and aids other researchers in understanding and using your code for replication or related purposes.
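One possible way to structure such a "testing" version is a single switch that reduces the number of replications while leaving the rest of the code untouched. The sketch below is hypothetical: the function names, the toy Monte Carlo exercise, and the replication counts are assumptions chosen only to illustrate the idea.

```python
import random
import statistics


def monte_carlo_mean(n_replications, sample_size, seed=0):
    """Average of sample means over many replications of a standard-normal draw."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_replications):
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        estimates.append(statistics.fmean(sample))
    return statistics.fmean(estimates)


def run(testing=False):
    """Run the exercise; the testing flag only reduces the number of replications."""
    n_replications = 50 if testing else 2_000
    return monte_carlo_mean(n_replications, sample_size=100, seed=123)


quick_result = run(testing=True)  # fast check that the code runs end to end
```

Keeping the full and simplified runs in the same code path, separated only by a flag, makes it easier to see that the testing version exercises the same routines as the original.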
If the provided data and code fail to reproduce the results presented in the paper, the Data Editor will reach out to you to identify the source of the discrepancy. Once the reproducibility checks are completed, if the paper or approved online appendices need to be adjusted as a result of the original discrepancy, even if minor, the Data Editor will notify the handling co-editor. The handling co-editor will need to approve these changes in order to grant final acceptance. If the changes significantly alter the paper's message, the original Editor may decide to reject the paper. Full reproducibility is a prerequisite for final acceptance.
If the replication package you submitted is incomplete, the Data Editor will contact you and provide guidance on the amendments and additions needed to pass the reproducibility checks. Once you have made the necessary revisions, the revised package will undergo the checks again.
We require you to resubmit the entire package to avoid any potential mishandling of files and to ensure accuracy. Patching a previously submitted package with additional files creates a risk that files are misplaced or that outdated versions are kept. Resubmitting the complete package helps maintain the integrity and proper handling of your files.
Content of the replication package
When feasible, the recommended approach is to provide a physical copy of your data by including it in a separate folder clearly labeled (e.g. "Confidential data not for publication") outside of the replication package. All replicators and the Data Editor have signed confidentiality agreements, ensuring that the data is only used for reproducibility checks. If providing a physical copy is not possible, it is advised to contact the Data Editor to make appropriate arrangements for granting access to the reproducibility team, even when this access must be granted directly by the data provider.
The purpose of submitting a signed checklist is to ensure that all elements of the replication package are included. By completing the checklist, you reduce the likelihood of iterative revisions and expedite the process.
Please refer to our Data and Code Availability Policy and our Prepare the replication package page for details on all items that should be uploaded. The use of the Social Sciences Data Editors' Template of README is strongly encouraged. While using the template is not required, all the items listed in the template are required by our Data and Code Availability Policy.
Yes, this is requested by our Data and Code Availability Policy.
PDF format ensures portability, allowing seamless transfer of files without concerns about dependencies, fonts, or other compatibility issues. This guarantees readability across different platforms and for all users.
Including non-proprietary copies of datasets is essential for several reasons. Firstly, it enables users who do not have access to the specific proprietary software used in your study to still access and utilize your data effectively. This enhances the reproducibility and transparency of your research by making the data more accessible and usable to a broader range of researchers.
Secondly, providing datasets in non-proprietary formats minimizes compatibility issues that may arise from software version disparities. For example, older versions of software may not be able to open files saved by newer versions, leading to potential data inaccessibility (e.g. in Stata). By offering data in universally compatible formats such as ASCII or CSV, you ensure that researchers can work with the data using a variety of tools and software environments.
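As a rough illustration of producing a non-proprietary copy, the following sketch writes a small set of records to CSV and reads it back using only Python's standard library. The records, field names, and helper functions are hypothetical, and the in-memory buffer stands in for a real file in the package.

```python
import csv
import io


def write_csv(rows, fieldnames, handle):
    """Write a list of records to an open text handle as plain CSV with a header row."""
    writer = csv.DictWriter(handle, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


def read_csv(handle):
    """Read the CSV back; all values come back as strings."""
    return list(csv.DictReader(handle))


# Hypothetical records standing in for a dataset held in a proprietary format.
records = [
    {"id": 1, "income": 30500.0},
    {"id": 2, "income": 41250.5},
]

buffer = io.StringIO()  # in-memory stand-in for open("data.csv", "w", newline="")
write_csv(records, ["id", "income"], buffer)
buffer.seek(0)
round_trip = read_csv(buffer)
```

Because CSV stores everything as text, variable types and labels are lost in the conversion, so it helps to document them in the README.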
Data citations
All datasets used in the paper and the approved online appendices must be appropriately cited in both the paper/appendices and in a dedicated references section of the README file. As a general guideline, citations of data employed in the paper should be included in the paper's references section, while citations exclusively pertaining to data used in the approved online appendices may be relegated to the appendix. However, in exceptional circumstances, such as when there is a large number of data sources to cite or when recommended by the handling co-editor, citations of data used in the paper may be included in a references section of the approved appendix. The citations included in the references section of the README file should follow the citation formats specified by each journal, ensuring that references can be accurately indexed by bibliometric search engines.
Yes, all datasets used in your paper should be listed in the references section, just like citing other papers. Additionally, a copy of these citations should be included in a dedicated section of the README file.
To cite your data, follow the same citation style used for citing other papers in the references section of your paper. A copy of these citations should also be included in a dedicated section of the README file. The exception is primary data published for the first time with the replication package: these should be cited in the paper only, and there is no need to self-cite them in the README file. Your replication package will only have a DOI once it is posted in the Econometric Society Journals' Community at Zenodo. It is therefore recommended to initiate a deposit to reserve a DOI before submitting the replication package, and to complete the deposit, with the secured DOI, after the reproducibility checks are concluded.
Data citations hold the same level of importance as citations to other papers, if not more. Properly crediting data providers aligns with scientific ethical standards. Moreover, by giving due credit to data providers, you contribute to their ability to secure external funding, ensuring the continued availability of their datasets for research purposes. Acknowledging the contributions of data providers is crucial for promoting transparency, integrity, and the advancement of scientific knowledge.
Reproducibility certification, publication of the replication package and copyright issues
The empirical/simulation/experimental papers that we checked include the following statement:
"The replication packages for this paper are available at [DOI here]. The Journal checked the data and codes included in the package for their ability to reproduce the results in the paper and approved online appendices."
This statement is adjusted accordingly when data exemptions are granted (acknowledging either that the authors provided temporary access to the confidential data or that the checks were implemented on simulated/synthetic data provided by the authors). In particular, we certify one of the following, depending on what is applicable:
"The replication packages for this paper are available at [DOI here]. The authors were granted an exemption to publish their data because either access to the data is restricted or the authors do not have the right to republish them. Therefore, the replication package only includes the codes but not the data. However, the authors provided the Journal with (or assisted the Journal to obtain) temporary access to the data. The Journal checked the restricted data and the provided codes for their ability to reproduce the results in the paper and approved online appendices."
"The replication packages for this paper are available at [DOI here]. The authors were granted an exemption to publish their data because either access to the data is restricted or the authors do not have the right to republish them. However, the authors included in the package a simulated or synthetic dataset that allows running their codes. The Journal checked the synthetic/simulated data and the codes for their ability to generate all tables and figures in the paper and approved online appendices. However, the synthetic/simulated data are not designed to reproduce the same results."
These statements are combined accordingly when more than one situation applies.
The statements are also adjusted when the algorithms are highly demanding and a partial/simplified version of the code has been used for the reproducibility checks. In particular, we add the following sentence at the end of the statement:
"Given the highly demanding nature of the algorithms, the reproducibility checks were run on a simplified version of the code, which is also available in the replication package."
After all reproducibility checks are completed, you will be requested by the Editorial Office to publish your checked package at the Econometric Society Journals' Community at Zenodo. Zenodo will assign your package a Digital Object Identifier (DOI), which will then be linked with your publication.
Yes, one of the main advantages of publishing the package yourself at the Econometric Society Journals' Community at Zenodo is that you are the copyright owner of, and solely responsible for, the specific publication. Therefore, it is important that you ensure that you have permission to publish your data before the time of first submission, and that you request an exemption at that time if you do not.
There are many advantages to publishing the package at the Econometric Society Journals' Community at Zenodo. Some of them are: (i) the author retains the copyright on the replication package, (ii) having a specific DOI increases the visibility of the package and, in turn, the visibility and impact of your article, and (iii) it makes the package easier to cite.
Yes, as long as one copy is published at the Econometric Society Journals' Community at Zenodo. The only exception is when your replication package is published in an open-access trusted repository with a permanent DOI. In that case, your DOI can be used to link your article with your package, and the Data Editor can waive the requirement to publish the package at the Econometric Society Journals' Community at Zenodo. However, publishing your package at the Econometric Society Journals' Community at Zenodo is recommended over other repositories, because it increases the visibility of your package by bundling it together with all the replication packages published at the Econometric Society journals.
Each provider has a different policy regarding re-distribution of original and transformed datasets. Some providers, for example, allow re-distribution as long as your extract is deposited in a specific repository. You should check the restrictions on publishing your data before the first submission. You should also seek permission from the original owner of the data to publish them, and cite the original source accordingly. You will be responsible for any copyright infringement in what you publish with the replication package at the Econometric Society Journals' Community at Zenodo. Here's an unofficial guidance on the terms of use of commonly used datasets.
Yes. Please address your request to [email protected]