According to Wikipedia, “data mining is a process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems. Data mining is an interdisciplinary subfield of computer science and statistics with an overall goal to extract information (with intelligent methods) from a data set and transform the information into a comprehensible structure for further use. Data mining is the analysis step of the “knowledge discovery in databases” process, or KDD.”
Data mining is a constantly evolving discipline applied in many fields, including finance, law, healthcare, marketing, science and engineering, the retail industry, telecommunications, social media, and government. This guide covers free, fee-based, and consultancy-related sources to help info pros, researchers, data analysts, knowledge managers, and CI/BI experts effectively identify and apply reliable, value-added data within the scope of their respective work products.
Mining Data on the Internet 2020
45 Great Resources for Learning Data Mining Concepts and Techniques
https://www.import.io/post/38-great-resources-for-learning-data-mining-concepts-and-techniques/
50 Data Mining Resources – Tutorials, Techniques and More
https://www.ngdata.com/data-mining-resources/
80legs – Custom Web Crawlers for Crawling and Processing Web Content
https://www.80legs.com/
2020 Directory of Directories
http://www.2020DirectoryOfDirectories.com/
2020 Guide to Finding Experts By Using the Internet
https://www.FindingExperts.info/
2020 Guide to Privacy Resources and Tools
https://www.StealthMode.info/
2020 Guide to Searching the Internet
https://www.SearchingTheInternet.info/
2020 New Economy Resources
https://www.2020NewEconomy.com/
Advanced Analytics – Unstructured Data Mining
https://advisory.kpmg.us/deal-advisory/data-driven-tech/advanced-analytics.html
An Evaluation of Data Mining Methods and Tools
https://www.idi.ntnu.no/~dingsoyr/project/report.html
An Overview of Data Mining in Road Traffic and Accident Analysis
https://www.jcaksrce.org/upload/49121180_vol2i4p6.pdf
ACM SIGKDD: Current Explorations Issue
https://www.kdd.org/explorations/issue.php?issue=current
Analytics, Data Mining and Data Science
https://www.kdnuggets.com/
Apache Pig – Platform for Analyzing Large Datasets
https://pig.apache.org/
Applications of Modern Heuristics and Data Mining Techniques
https://www.people.vcu.edu/~mmanic/papers/grads/McCarty_08_MHandAdvancDMTechniqs.pdf
ARTstor – Digital Image Library for Education and Scholarship
https://www.artstor.org/
Benchmarking – Data Mining Benchmarking Association
https://www.dmbenchmarking.com/
Best Data Mining Tools – Reviews, Pricing and Demos
https://www.softwareadvice.com/bi/data-mining-comparison/
BI-DW – Business Intelligence and Data Warehousing Directory
http://www.bi-dw.info/
Big Data Analytics with Oracle Advanced Analytics
https://blogs.oracle.com/datamining/entry/big_data_analytics_with_oracle
Big Oil Goes Mining for Big Data
https://www.technologyreview.com/news/427876/big-oil-goes-mining-for-big-data/
Bot Research 2020
https://www.BotResearch.info/
Business Intelligence Resources 2020
https://www.BIResources.info/
Calculating Costs of a Data Mining System
https://www.eweek.com/c/a/Data-Storage/Calculating-Costs-of-a-DataMining-System
CCSU – Data Mining
https://web.ccsu.edu/datamining/
Center for Automated Learning and Discovery – Machine Learning Department
https://www.ml.cmu.edu/
Cogitum Co-Citer
https://www.cogitum.com/co-tracker-text/more.shtml
Contentmine – Text and Data Mining Open Source Tools
https://contentmine.org/
Copyright Clearance Center
https://www.copyright.com/
COREMINE Medical – Biomedical Mindmap
https://www.coremine.com/medical/
Current Awareness Discovery Tools on the Internet 2020
https://www.zillman.us/white-papers/current-awareness-discovery-tools-on-the-internet/
http://www.CurrentAwarenessTools.com/
DataMelt – Computation and Visualization Environment
https://jwork.org/dmelt/
Data Mining 101 Tools and Techniques
https://iaonline.theiia.org/data-mining-101-tools-and-techniques
Data Mining Tutorial
https://www.tutorialspoint.com/data_mining/index.htm
Data Engineering Bulletin
https://tab.computer.org/tcde/bull_about.html
DataFerrett – Data Mining Tool
https://dataferrett.census.gov/
Data Fountains: Open Source Internet Resource Discovery and Metadata/Full-Text Generation Service
https://sourceforge.net/projects/datafountains/
Data Mining Amazon Web Services (AWS) Big Data – Data Lakes and Analytics
https://aws.amazon.com/big-data/
Data Mining and Knowledge Discovery Journal
http://link.springer.com/journal/10618
Data Mining and Predictive Analytics
https://abbottanalytics.blogspot.com/
Data Mining Applications in Transportation Engineering
https://www.slideshare.net/Tommy96/data-mining-applications-in-transportation-engineering
Data Mining Case Study – Mining complex financial information
https://automatedinsights.com/blog/how-financial-services-companies-can-win-over-millennials-better-customer-communication-through-automation/
Data Mining Concepts
https://msdn.microsoft.com/en-us/library/ms174949.aspx
Data Mining ebook: Theories Algorithms and Examples
https://www.routledge.com/products/9781439808382
Data Mining for the Masses
https://www.onlineprogrammingbooks.com/data-mining-masses/
Data Mining – Federal Efforts Cover a Wide Range of Uses Report
https://www.gao.gov/new.items/d04548.pdf
Data Mining Glossary
https://www.gartner.com/it-glossary/data-mining/
Data Mining Group (DMG)
https://www.dmg.org/
Data Mining in Banking and its Applications
https://www.scribd.com/doc/270947349/Data-Mining-Banking#scribd
Data Mining Oil and Gas Hydrocarbon Exploration Data
https://www.analytics-magazine.org/november-december-2011/695-how-big-data-is-changing-the-oil-a-gas-industry
Data Mining, Predictive Modeling, Business Analytics: Training, Consulting & Solutions
https://www.the-modeling-agency.com/
Data Mining Primer from Oracle
https://docs.oracle.com/cd/B28359_01/datamine.111/b28129/process.htm
Data Mining Publications from Google
https://research.google.com/pubs/DataMining.html
Data Mining Resources 2020
https://www.DataMiningResources.info/
Data Mining Resources
https://www.cs.purdue.edu/homes/ayg/CS590D/resources.html
Data Mining Resources
https://datamining.togaware.com/
Data Mining Resources at CCSU
https://web.ccsu.edu/datamining/resources.html
Data Mining Table Analysis Tool
https://technet.microsoft.com/en-us/library/dd299414(v=sql.100).aspx
Data Mining Techniques in CRM
http://www.data-miners.com/
Data Mining: Technology and Policy – The DHS Privacy Office
https://www.dhs.gov/xlibrary/assets/privacy/privacy_rpt_datamining_200812.pdf
Data Mining: Text Mining, Visualization and Social Media
https://datamining.typepad.com/data_mining/
Data Mining Tools
https://www.icsti.org/IMG/pdf/VTTDataMiningTools.pdf
Data Mining Tools
https://thenewstack.io/six-of-the-best-open-source-data-mining-tools/
Data Mining, Web Scraping, Web Mining, Data Extraction and Screen Scraping Technology Links
https://www.connotate.com/
Data Mining, Web Mining, and Business Intelligence Solutions from Salford Systems
https://www.salford-systems.com/
Data Mining White Paper – Free Best Practices Guide
https://www.sas.com/data-mining/
Data Mining White Paper from Intel – Turning Big Data Into Big Insights
https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/big-data-visualization-turning-big-data-into-big-insights.pdf
Data Mining – Wikipedia
https://en.wikipedia.org/wiki/Data_mining
Datanami – Big Data, Big Analytics, and Big Insights
https://www.datanami.com/
Data-PASS
http://www.data-pass.org/
Data Science Toolkit
http://www.datasciencetoolkit.org/
Datasets for Data Mining and Data Science
https://www.kdnuggets.com/datasets/index.html
Data Shaping Data Mining Resources
https://www.datashaping.com/data_mining.shtml
Data Sources
https://www.the-data-mine.com/Misc/DataSource
Data Visualizations Derived From Data Mining Big Data
http://exploringdata.github.io/
Data Warehousing and Data Mining
https://www.dei.unipd.it/~capri/SI/MATERIALE/DWDM0405.pdf
DbVisualizer – The Universal Database Tool
https://www.dbvis.com/
DeepDive – Analyze Data On a Deeper Level Than Ever Before
https://deepdive.stanford.edu/
Deep Learning for Java – Open Source, Distributed, Deep Learning Library for the JVM
https://deeplearning4j.org/
Deep Web Research and Discovery Resources 2020
https://DeepWeb.us/
Digital Operating Systems Tools and Resources 2019/2020
https://www.DigitalOperatingSystems.com/
Data Warehouse, Data Mart, Data Mining and Decision Support Resources
https://www.infogoal.com/dmc/dmcdwh.htm
DiscoverText – Capture Text Data and Crunch Your Data
https://discovertext.com/
Distributed Data Mining in Credit Card Fraud Detection
http://cs.fit.edu/~pkc/papers/ieee-is99.pdf
Easy Data Mining Software
https://www.tableau.com/
Easy PDF Cloud
https://www.easypdfcloud.com/
eBiquity Research Group Blogger
https://ebiquity.umbc.edu/blogger/
Early Canadiana Online
https://www.canadiana.ca/
Elastic Web Mining Talk
https://www.slideshare.net/kkrugler/elastic-web-mining-2407818
ELKI: Environment for Developing KDD-Applications Supported by Index-Structures
https://elki-project.github.io/
EU Open Data Portal
https://data.europa.eu/euodp/en/home
Everything You Wanted to Know About Data Mining but Were Afraid to Ask by Alexander Furnas
https://www.theatlantic.com/technology/archive/2012/04/everything-you-wanted-to-know-about-data-mining-but-were-afraid-to-ask/255388/
GeneMiner
https://www.biomedcentral.com/1471-2105/8/S8/P3
Google BigQuery – Query Cloud Based Datasets
https://cloud.google.com/bigquery/
Google Open Refine 2.0 – Open Source Power Tool for Data Wranglers and Working With Messy Data
https://github.com/OpenRefine
Great War Primary Documents Archive
https://www.gwpda.org/
Healthdata.gov
https://www.healthdata.gov/
History of Data Mining by Raymond Li
https://rayli.net/blog/data/history-of-data-mining/
Howard D. Wactlar Home Page
https://www.cs.cmu.edu/~hdw/
IBM Data Mining Cognos Business Solutions
https://www.cognoise.com/index.php?topic=19929.0
Imagination Engines
https://www.Imagination-Engines.com/
Indiegogo Datasets
https://webrobots.io/indiegogo-dataset/
Information Retrieval (IR) and Information Extraction (IE) on the Web Using Hypertext Meta-Data and Structure
https://www.webir.org/
InfoVis CyberInfrastructure
https://iv.slis.indiana.edu/index.html
International Journal of Business Intelligence and Data Mining (IJBIDM)
https://www.inderscience.com/jhome.php?jcode=ijbidm
International Journal of Data Mining and Bioinformatics (IJDMB)
https://www.inderscience.com/jhome.php?jcode=ijdmb
International Journal of Data Warehousing and Mining (IJDWM)
https://www.igi-global.com/journal/international-journal-data-warehousing-mining/1085
Internet Archive
https://www.archive.org/
Inter-university Consortium for Political and Social Research (ICPSR)
https://www.icpsr.umich.edu/
InvestigateIX Search Engine and Text Mining Toolbox
https://www.mandalka.name/investigateix/
Jaspersoft® ETL – The Open Source Data Integration Platform
https://community.jaspersoft.com/project/jaspersoft-etl
Junar – The Open Data Platform
https://www.junar.com/
Kaggle – Go from Big Data to Big Analytics
https://kaggle.com/
KDD-2008
https://www.kdd2008.com/
KDD-2009
https://www.kdd.org/conferences/kdd-2009-paris-france-june-28-july-1
KDD-2010
https://www.kdd.org/conferences/kdd-2010-washington-dc-july-25-28
KDD-2011
https://www.kdd.org/conferences/kdd-2011-san-diego-ca-august-21-24-2011
KDD-2012
https://www.kdd.org/conferences/kdd-2012-august-12-16-2012-beijing-china
KDD-2014
https://www.kdd.org/kdd2014/
KDD-2015
https://www.kdd.org/kdd2015/
KDD-2016
https://www.kdd.org/kdd2016/
KDD-2017
https://www.kdd.org/kdd2017/
KDD-2018
https://www.kdd.org/kdd2018/
KDD-2019
http://www.kdd.org/kdd2019/
KDD-2020
https://www.kdd.org/kdd2020
KDnuggets: Data Mining, Web Mining, and Knowledge Discovery Guide
https://www.kdnuggets.com/
KEEL (Knowledge Extraction Based on Evolutionary Learning)
https://www.keel.es/
Kickstarter Datasets
https://webrobots.io/kickstarter-datasets/
KNIME – Konstanz Information Miner Open Source Software
https://www.knime.org/
Knowledge Discovery Resources 2020
https://www.KnowledgeDiscovery.info/
Knowledge Discovery Resources 2020 Annotated White Paper Link Compilation by Marcus P. Zillman, M.S., A.M.H.A.
https://www.KDResources.info/
Knowledge Enterprise Semantic Intelligence Suite
https://transinsight.com/
KnowleSys – Web Public Opinion Monitoring
https://www.knowlesys.com
LingPipe – Information Extraction and Data Mining Tools
https://alias-i.com/lingpipe/
LoginWorks – Advanced Solutions – Data Mining and Web Scraping
https://www.loginworks.com/
Machine Learning from Scratch
https://github.com/eriklindernoren/ML-From-Scratch
Mallet – MAchine Learning for LanguagE Toolkit
https://mallet.cs.umass.edu/
Marriott Library at the University of Utah Digital Collections
https://www.lib.utah.edu/
Marti Hearst Home Page
https://people.ischool.berkeley.edu/~hearst/
Media Patterns – Detecting Patterns in the Global Media Content
https://mediapatterns.enm.bris.ac.uk/
Megaputer – Data Mining and Text Mining Software
https://www.megaputer.com/
Microsoft® Data Mining Project – Efficient Data Exploration and Modeling
https://research.microsoft.com/en-us/projects/datamining/
Minerazzi – Your Search-and-Mine Ecosystem
https://www.minerazzi.com/
Mining Road Traffic Accident Data
https://ai-d.org/pdfs/Beshah.pdf
Mining Spatial Data of Traffic Accidents
https://www.stat-d.si/mz/mz5.1/lavrac.pdf
MIT OpenCourseWare study and certification Data Mining Discipline
https://ocw.mit.edu/courses/sloan-school-of-management/15-062-data-mining-spring-2003/
MOA (Massive Online Analysis)
https://moa.cms.waikato.ac.nz/
MoData – Big Data Resources
http://www.mo-data.com/
MonetDB Query Processing at Light Speed
https://www.monetdb.org/
Mozenda – Data Extraction and Comprehensive Web Data Gathering
https://www.mozenda.com/
National Archives, London
https://nationalarchives.gov.uk/
National Centre for Text Mining (NaCTeM)
https://www.nactem.ac.uk/
National Science Digital Library (NSDL)
https://nsdl.oercommons.org/
National Technical Information Service (NTIS)
https://www.ntis.gov/
Neural Networks in Data Mining
https://www.jatit.org/volumes/research-papers/Vol5No1/1Vol5No6.pdf
Nesstar – Publish Data on the Web
https://www.nesstar.com/
NetOwl – Entity Extraction and Entity Analytics for Big Data
https://www.netowl.com/
New York Public Library
https://www.nypl.org/
Nuix – eDiscovery and Electronic Investigation Software
http://www.nuix.com/
Observatory on Social Media (OSoMe)
https://truthy.indiana.edu/
Online News Archive
https://onlinenewsarchive.com/
OntoMiner: Bootstrapping and Populating Ontologies From Domain Specific Web Sites
https://www.public.asu.edu/~hdavulcu/VLDB-WS03.pdf
Open Data Barometer
https://www.opendataresearch.org/project/2013/odb
Open Data Handbook – Guides, Case Studies and Resources for Government and Civil Society On the What, Why and How of Open Data
https://opendatahandbook.org/
Open Data Inception
https://opendatainception.io/
Open Data Institute
https://theodi.org/
Open Data Inventory (ODIN)
https://odin.opendatawatch.com/
Open Data Network
https://www.opendatanetwork.com/
Open Datasets
https://github.com/caesar0301/awesome-public-datasets
https://www.kaggle.com/datasets
https://www.quora.com/Where-can-I-find-large-datasets-open-to-the-public
https://aws.amazon.com/public-datasets/
https://repository.upenn.edu/mead
https://catalog.data.gov/dataset
Open Educational Resources (OER) Sources 2020
http://www.OERSources.com/
OpenMinted – Open Service Oriented e-Infrastructure for Scientific and Scholarly Text and Data Mining
https://openminted.eu/
Open/Public Data Sources
https://www.scaleunlimited.com/datasets/public-datasets/
Open Source Data Mining Tools
https://www.scaleunlimited.com/oss/open-source-data-mining-tools/
Oracle Data Mining
https://www.oracle.com/technetwork/database/options/advanced-analytics/odm/overview/index.html
Oracle Knowledge base about Big Data Mining
https://docs.oracle.com/apps/search/search.jsp?word=data+mining&product=b28359-01&book=b28129
Orange – Open Source Data Visualization and Analysis for Novice and Experts
https://orange.biolab.si/
Overview – Open Source Document Mining
https://blog.overviewdocs.com/
PC AI Magazine Artificial Intelligence
https://www.pcai.com/
Pentaho BI Project – Open Source Business Intelligence
https://www.pentaho.com/
PEPITe S.A. – Unlock Your Knowledge
https://www.pepite.be/
Prediction Markets 2020
https://www.PredictionMarkets.com/
Predictive Model Markup Language (PMML) – SourceForge.net: Project Info
https://sourceforge.net/projects/pmml
Predictive Model Markup Language (PMML)
https://xml.coverpages.org/pmml.html
Probabilistic Data Models for Web Analytics and Data Mining
https://highlyscalable.wordpress.com/2012/05/01/probabilistic-structures-web-analytics-data-mining/
Proxycrawl – Stay Anonymous While Crawling the Web
https://proxycrawl.com/
QDA Miner Lite (Freeware)
https://provalisresearch.com/products/qualitative-data-analysis-software/freeware/
QL2 Software – Unstructured Data Management and Web Mining Software
https://www.ql2.com/
QueryTree – Explore Data Without Code
https://querytreeapp.com/
Raghu Ramakrishnan Home Page
https://pages.cs.wisc.edu/~raghu/
RapidMiner – Open Source Data Mining Tool
https://rapid-i.com/content/blogcategory/10/69/
Rattle – Data Mining Toolkit in R
https://code.google.com/p/rattle/
re3data.org – 2,000 Data Repositories
https://www.re3data.org/
Recommended Books on Data Mining
https://www.albionresearch.com/books/data_mining.php
reSearcher
https://researcher.sfu.ca/
Rexer Analytics – Analytic and CRM Consulting
https://www.rexeranalytics.com/
Ron Kohavi Home Page
https://robotics.stanford.edu/~ronnyk/
SAS – Data and Text Mining
https://www.sas.com/technologies/analytics/datamining/index.html
SAS What is Data Mining
https://www.sas.com/en_us/insights/analytics/data-mining.html
SCaVis – Scientific Computation and Visualization Environment
https://jwork.org/scavis/
Scientific Data Repository – Real Time Visualization and Exploration Techniques
https://www.mlvis.com/platform.php
Screen-Scraper – Data Extraction Software and Services
https://www.screen-scraper.com/
Searching the Internet 2020
https://www.SearchingTheInternet.info/
Semantic Scholar – Free Scientific Literature Search and Discovery
https://allenai.org/semantic-scholar/
SIGKDD – ACM Special Interest Group – Knowledge Discovery in Data and Data Mining
https://en.wikipedia.org/wiki/SIGKDD
Slideshare Presentations About Data Mining – a List
http://www.kdnuggets.com/2014/11/most-popular-slideshare-presentations-data-mining.html
Slideshare Presentation about Data Mining
http://www.slideshare.net/smj/data-mining-slides
Smithsonian/NASA Astrophysics Data System (ADS)
http://adsabs.harvard.edu/index.html
Snorkel: A System for Fast Training Data Creation
https://hazyresearch.github.io/snorkel/
Social Buzz Bot 2020 – Business Intelligence Data Mining for Information Discovery from Social Communities [PDF file download]
https://www.SocialBuzzBot.com/
Software Suites for Data Mining, Analytics, and Knowledge Discovery
https://www.kdnuggets.com/software/suites.html
Special Interest Group – Knowledge Discovery in Data and Data Mining – SIGKDD Explorations Newsletter
https://www.kdd.org/explorations/
SPMF – Open Source Data Mining Library
https://www.philippe-fournier-viger.com/spmf/
Stanford Data Mining Course cs345a course handouts
https://web.stanford.edu/class/cs345a/handouts.html
Statistical Analysis and Data Mining
https://onlinelibrary.wiley.com/journal/10.1002/%28ISSN%291932-1872
Statistics Resources and Big Data 2020
https://www.StatisticsResources.info/
Statoo Statistical Consulting + Data Analysis + Data Mining
https://www.statoo.com/en/
Streaming Data Mining
https://www.cs.yale.edu/homes/el327/papers/streaming_data_mining.pdf
Talend Open Data Solutions
https://www.talend.com/
Tanagra Project – Free Data Mining Software for Academic and Research Purposes
https://eric.univ-lyon2.fr/~ricco/tanagra/en/tanagra.html
Text Data Mining
https://people.ischool.berkeley.edu/~hearst/talks/dm-talk/
Text Mining for Scholarly Communications and Repositories
https://www.nactem.ac.uk/tm-ukoln.php
The Archaeology Data Service (ADS)
https://archaeologydataservice.ac.uk/
The Centre for Contemporary Canadian Art – Canadian Art Database Project
https://ccca.concordia.ca/
The Data Mine
https://www.the-data-mine.com/
The History Data Service (HDS)
https://hds.essex.ac.uk/
The National Centre for Text Mining: Aims and Objectives by Sophia Ananiadou, Julia Chruszcz, John Keane, John McNaught and Paul Watry
https://www.ariadne.ac.uk/issue42/ananiadou/
The New York Times Article Search API
https://developer.nytimes.com/
The Open Access Digital Library
https://grweb.coalliance.org/oadl/oadl.html
The Ultimate Artificial Intelligence Resources Guide by Kyle Poyar
https://labs.openviewpartners.com/artificial-intelligence-resources-guide/
Togaware – Data Mining Resources
https://datamining.togaware.com/
T-Rex (Trainable Relation Extraction)
https://sourceforge.net/projects/t-rex/
Try Data Mining Queries Interactively Online using sample dataset
https://overpass-turbo.eu/
UC Irvine Machine Learning Repository
https://archive.ics.uci.edu/ml/index.php
Udemy Course About Data Mining
https://www.udemy.com/data-mining/
University of Florida Digital Collections (UFDC)
http://ufdc.ufl.edu/
University of North Texas Digital Collections
https://digital.library.unt.edu/explore/collections/
Using the Internet As a Dynamic Resource Tool for Knowledge Discovery 2019
http://www.zillman.us/white-papers/using-the-internet-as-a-dynamic-resource-tool-for-knowledge-discovery/
Wallmine – Wall Street Data Mining
https://wallmine.com/
Web Curator Tool (WCT) – Management of Selective Web Harvesting Process
https://webcurator.sourceforge.net/
Web Data Extractors 2020
https://www.WebDataExtractors.com/
Webhose.io – Turn Unstructured Web Content Into Machine-Readable Data Feeds That You Can Consume On Demand
https://webhose.io/
Weka 3: Data Mining Software in Java
https://www.cs.waikato.ac.nz/~ml/weka/
Weka 3 – Data Mining with Open Source Machine Learning Software in Java
https://www.cs.waikato.ac.nz/~ml/weka/index.html
World Bank Datasets For Data Mining
https://datacatalog.worldbank.org/
Zentut – What is Data Mining Tutorial
https://www.zentut.com/data-mining/what-is-data-mining/