The Best AI Tools for Literature Reviews (I Tested Them So You Don’t Have To)
By Andy Stapleton
AI-Powered Literature Reviews: A Comparative Analysis of Six Tools
Key Concepts:
- Literature Review Tools: Software designed to assist in the process of researching and synthesizing existing academic literature.
- AI Detection: The process of identifying whether text was generated by artificial intelligence.
- Exportability: The ability to save and edit the generated output in various file formats (e.g., Word, LaTeX, Markdown).
- Hallucinations (in AI): The generation of factually incorrect or nonsensical information, such as fabricated citations.
- Agentic AI: AI systems capable of autonomous action and decision-making.
- Overleaf: A collaborative, cloud-based LaTeX editor.
- Flexural Endurance: A material's ability to withstand repeated bending without failing.
1. Testing Methodology & Criteria
The video presents a comparative analysis of six AI-powered literature review tools: SciSpace, Thesis AI, AnswerThis, ChatGPT, Gemini, and Manus AI. The tools were evaluated on five criteria:
- Number of References: The quantity of relevant sources cited in the generated review. A higher number was preferred.
- Length: The overall word count of the generated review. The ideal output is detailed enough to be useful without being overly verbose.
- Readability (out of 5): An assessment of the writing quality, focusing on academic tone, clarity, and avoidance of overly complex or unnatural language.
- Exportability: The ability to save the output in editable formats (Word, LaTeX, Markdown) for further refinement.
- AI Detection: The likelihood of the generated text being flagged as AI-written by detection tools (Originality.ai/ZeroGPT).
All tools were prompted with the same query: “Write an academic literature review on self-healing nano composite transparent electrodes in a formal scholarly tone. Synthesize the key theories, methods, and debates in the field rather than summarizing individual papers. Critically evaluate the areas of agreement, disagreement, note strengths and limitations of existing research. Clearly identify gaps or unresolved questions. Write for a graduate level academic audience and avoid inventing citations. State uncertainty where needed.” This prompt was designed to elicit analytical synthesis rather than simple summarization, and to discourage the generation of fabricated citations ("hallucinations").
2. Tool-Specific Outputs & Performance
2.1 SciSpace:
- References: 28
- Length: Long and detailed.
- Readability: Good, with a strong academic foundation due to its extensive database.
- Exportability: Exportable, but requires payment for certain formats.
- AI Detection: 100% detected as AI-generated.
- Overall: Considered a strong contender, particularly for the volume of references and database quality. The presenter identified it as their current "go-to" for academic AI tools.
2.2 Thesis AI:
- References: 13
- Length: The longest output (23,000 words), described as "girthy" and dense.
- Readability: Highest readability score, exhibiting strong academic language, though occasionally wordy.
- Exportability: Excellent, offering exports to Overleaf, Word, PDF, and Markdown.
- AI Detection: 100% detected as AI-generated.
- Overall: Highly recommended for literature reviews, despite lacking features like tables or diagrams (which the CEO is reportedly working on). The presenter favored it for its balance of references, readability, and export options.
2.3 AnswerThis:
- References: 6 (the fewest of all tools)
- Length: Shortest output.
- Readability: Lowest readability score, using unfamiliar terminology ("flexural endurance") and overly complex phrasing.
- Exportability: Good, with PDF and notebook export options.
- AI Detection: 100% detected as AI-generated.
- Overall: Considered the weakest performer, with the fewest references and the poorest readability of the six tools.
2.4 ChatGPT:
- References: 16
- Length: Approximately 5,000 words.
- Readability: Moderate.
- Exportability: Poor; the output was difficult to export in any form other than a shared link.
- AI Detection: 100% detected as AI-generated.
- Overall: Functional but hampered by export limitations, making it less suitable for academic writing requiring extensive editing.
2.5 Gemini:
- References: 36 (the most of all tools)
- Length: Long, comparable to SciSpace.
- Readability: Good.
- Exportability: Exportable to Word, PDF, and Markdown, but requires payment for some options.
- AI Detection: 100% detected as AI-generated.
- Overall: Strong performance, particularly in reference quantity and in data synthesis, which it presented in helpful tables. Free to use for the most part.
2.6 Manus AI:
- References: 19
- Length: Moderate.
- Readability: Good.
- Exportability: Exportable to Word, PDF, and Markdown.
- AI Detection: 100% detected as AI-generated.
- Overall: A solid performer, offering a good balance of features and output quality.
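As a quick cross-check of the reference counts reported above, the following minimal Python sketch ranks the six tools from most to fewest references. Only the counts come from the results listed in this section; the variable names are illustrative assumptions, and the ranking says nothing about the quality or relevance of the cited sources.

```python
# Reference counts as reported in the per-tool results above.
# The dictionary and variable names are illustrative, not from the video.
reference_counts = {
    "SciSpace": 28,
    "Thesis AI": 13,
    "AnswerThis": 6,
    "ChatGPT": 16,
    "Gemini": 36,
    "Manus AI": 19,
}

# Rank tools from most to fewest references.
ranking = sorted(reference_counts.items(), key=lambda item: item[1], reverse=True)

for position, (tool, count) in enumerate(ranking, start=1):
    print(f"{position}. {tool}: {count} references")
```

Sorting this way puts Gemini first and AnswerThis last, which matches the observations in the findings that follow.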
3. Key Findings & Observations
- AI Detection is Universal: All tools generated text that was 100% identified as AI-generated by detection software. This highlights the importance of substantial editing and revision when using these tools for academic work.
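- Exportability is Crucial: The ability to export to editable formats (Word, LaTeX) is essential for practical application. Thesis AI excelled in this area, with exports to Overleaf, Word, PDF, and Markdown. AnswerThis also exported well (PDF and notebook), whereas ChatGPT offered little beyond a shareable link.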
- Reference Quantity Varies Significantly: Gemini (36) and SciSpace (28) generated the most references, while AnswerThis (6) produced the fewest.
- Readability Requires Scrutiny: AI-generated text often suffers from overly complex phrasing and unnatural language. Thesis AI demonstrated the best readability, while AnswerThis was the weakest.
- Prompt Engineering Matters: The detailed prompt used in the test was designed to elicit analytical synthesis and to discourage hallucinations.
4. Synthesis & Conclusion
The video concludes that SciSpace and Thesis AI are the strongest contenders for AI-assisted literature reviews. SciSpace excels in reference quantity and database access, while Thesis AI offers superior readability, export options, and overall suitability for academic writing. Gemini also performed well, particularly in data synthesis and reference volume.
The presenter emphasizes that these tools should be used as a foundation for research, not a replacement for critical thinking and original writing. Substantial editing and revision are necessary to ensure accuracy and clarity and to avoid plagiarism. Although every output was detected as AI-generated, these tools can significantly accelerate the literature review process when used responsibly.
“I really like Thesis AI for literature reviews… this would be my go-to base for something I was working on.” – Presenter, regarding Thesis AI’s overall performance.