This time series dataset includes viral COVID-19 laboratory test [polymerase chain reaction (PCR)] results from over 1,000 U.S. laboratories and testing locations, including commercial and reference laboratories, public health laboratories, hospital laboratories, and other testing locations.
All the text2sql data
Text-to-SQL Generation for Question Answering on Electronic Medical Records
NVBench is a large dataset for complex, cross-domain NL2VIS tasks. It covers 105 domains, supports seven common types of visualizations, and contains 25,750 (NL, VIS) pairs.
Spider is a large-scale complex and cross-domain semantic parsing and text-to-SQL dataset annotated by 11 Yale students.
A large crowd-sourced dataset for developing natural language interfaces for relational databases.
This is a repository of dad jokes meant to be used to fine-tune an LLM to have a sense of humor.