Data Mining Resources 2022

Data mining and knowledge discovery is a rapidly evolving field that is part of the portfolio of CI, BI and KM professionals, law librarians, research analysts, infopros, data scientists, data journalists, and students in college and graduate programs. This expansive bibliography comprises a wealth of information, resources, tools, techniques and applications, as well as links to many open datasets. The subject matter includes data mining, data scraping, data aggregation, big data and big analytics. The resources include ebooks and glossaries, research papers, scientific and academic papers, substantive articles, case studies, bibliographies, video tutorials and online training, certifications on data mining, APIs, datasets, open source web data extraction tools, and open source code.

10 Powerful Data Mining Tools for 2022
https://hevodata.com/learn/data-mining-tools/

25 Best Data Mining Tools in 2022
https://medium.com/datatobiz/25-best-data-mining-tools-in-2022-65e77c905a2b

50 Data Mining Resources – Tutorials, Techniques and More
https://www.ngdata.com/data-mining-resources/

80legs – Easy Web Scraping Tools and Cloud-Based Web Crawling
https://www.80legs.com/

Advanced Analytics – Unstructured Data Mining
https://advisory.kpmg.us/deal-advisory/data-driven-tech/advanced-analytics.html

An Evaluation of Data Mining Methods and Tools
https://www.idi.ntnu.no/~dingsoyr/project/report.html

ACM SIGKDD: Current Explorations Issue – The mission of KDD is to promote the rapid maturation of the field of knowledge discovery in data and data mining
https://www.kdd.org/explorations/issue.php?issue=current

Apache Pig – Platform for Analyzing Large Datasets
https://pig.apache.org/

Applications of Modern Heuristics and Data Mining Techniques – Thesis
https://www.people.vcu.edu/~mmanic/papers/grads/McCarty_08_MHandAdvancDMTechniqs.pdf

ARTstor – Digital Image Library for Education and Scholarship
https://www.artstor.org/

Astera Software – Mine insights from unstructured documents, such as PDFs, DOCs, RTFs, XLSXs and others with Astera ReportMiner
https://discover.astera.com/data-mining-trial/

Best Data Mining Software and Tools 2022
https://www.enterprisenetworkingplanet.com/data-center/data-mining-tools/

Best Data Mining Tools – Reviews, Pricing and Demos
https://www.softwareadvice.com/bi/data-mining-comparison/

BI-DW – Business Intelligence and Data Warehousing Directory
https://www.bi-dw.info/

Bot Research 2022
https://www.BotResearch.info/

Business Intelligence Resources 2022
https://www.BIResources.info/

CCSU – Data Mining
https://web.ccsu.edu/datamining/

Center for Automated Learning and Discovery – Machine Learning Department
https://www.ml.cmu.edu/

Cogitum Co-Citer
https://www.cogitum.com/co-tracker-text/more.shtml

COGNOiSe Analytics – The largest independent IBM Cognos collaboration community
https://www.cognoise.com/index.php/board,184.0.html

Contentmine – Text and Data Mining Open Source Tools
https://github.com/ContentMine

Copyright Clearance Center
https://www.copyright.com/

Current Awareness Tools 2022
https://www.CurrentAwarenessTools.com/

DataMelt – Computation and Visualization Environment
https://jwork.org/dmelt/

Data Engineering Bulletin
https://tab.computer.org/tcde/bull_about.html

Data Fountains: Open Source Internet Resource Discovery and Metadata/Full-Text Generation Service
https://sourceforge.net/projects/datafountains/

Data Mining 101 Tools and Techniques
https://iaonline.theiia.org/data-mining-101-tools-and-techniques

Data Mining Amazon Web Services (AWS) Big Data – Data Lakes and Analytics
https://aws.amazon.com/big-data/

Data Mining and Knowledge Discovery Journal
https://link.springer.com/journal/10618

Data Mining and Predictive Analytics
https://abbottanalytics.blogspot.com/

Data Mining Case Study – Mining Complex Financial Information
https://automatedinsights.com/blog/how-financial-services-companies-can-win-over-millennials-better-customer-communication-through-automation/

Data Mining Concepts
https://msdn.microsoft.com/en-us/library/ms174949.aspx

Data Mining Definition – Investopedia
https://www.investopedia.com/terms/d/datamining.asp

Data Mining ebook: Theories, Algorithms, and Examples
https://www.routledge.com/products/9781439808382

Data Mining for the Masses
https://www.onlineprogrammingbooks.com/data-mining-masses/

Data Mining – Federal Efforts Cover a Wide Range of Uses Report
https://www.gao.gov/new.items/d04548.pdf

Data Mining Glossary
https://www.gartner.com/it-glossary/data-mining/

Data Mining Group (DMG)
https://www.dmg.org/

Data Mining in Banking and its Applications
https://www.scribd.com/doc/270947349/Data-Mining-Banking#scribd

Data Mining, Predictive Modeling, Business Analytics: Training, Consulting & Solutions
https://www.the-modeling-agency.com/

Data Mining Primer from Oracle
https://docs.oracle.com/cd/B28359_01/datamine.111/b28129/process.htm

Data Mining Publications from Google
https://research.google/pubs/?area=data-mining-and-modeling

Data Mining Resources 2022
https://www.DataMiningResources.info/

Data Mining Resources
https://www.cs.purdue.edu/homes/ayg/CS590D/resources.html

Data Mining Resources
https://datamining.togaware.com/

Data Mining Table Analysis Tool
https://technet.microsoft.com/en-us/library/dd299414(v=sql.100).aspx

Data Mining Techniques in CRM
https://www.data-miners.com/

Data Mining: Technology and Policy – The DHS Privacy Office
http://www.dhs.gov/xlibrary/assets/privacy/privacy_rpt_datamining_200812.pdf

Data Mining: Text Mining, Visualization and Social Media
https://datamining.typepad.com/data_mining/

Data Mining: The Complete Guide for 2022
https://bootcamp.cvn.columbia.edu/blog/data-mining-guide/

Data Mining Tools – Six of the Best Open Source Data Mining Tools
https://thenewstack.io/six-of-the-best-open-source-data-mining-tools/

Data Mining Tutorial
https://www.tutorialspoint.com/data_mining/index.htm

Data Mining, Web Scraping, Web Mining, Data Extraction and Screen Scraping Technology Links
https://www.import.io/

Data Mining, Web Mining, and Business Intelligence Solutions from Salford Systems – Salford Predictive Modeler®
https://www.salford-systems.com/

Data Mining White Paper – Free Best Practices Guide
https://www.sas.com/data-mining/

Data Mining White Paper from Intel – Turning Big Data Into Big Insights
https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/big-data-visualization-turning-big-data-into-big-insights.pdf

Data Mining – Wikipedia
https://en.wikipedia.org/wiki/Data_mining

Dataminr – Real-Time AI for Event and Risk Detection
https://www.dataminr.com/

Datanami – Big Data, Big Analytics, and Big Insights
https://www.datanami.com/

Data-PASS
https://www.data-pass.org/

Datasets for Data Mining, Data Science and Machine Learning
https://www.kdnuggets.com/datasets/index.html

Data Shaping Data Mining Resources
https://www.datashaping.com/data_mining.shtml

Data Sources
https://www.the-data-mine.com/Misc/DataSource

Data Visualizations Derived From Data Mining Big Data
https://exploringdata.github.io/

Data Warehousing and Data Mining
https://www.dei.unipd.it/~capri/SI/MATERIALE/DWDM0405.pdf

DbVisualizer – The Universal Database Tool
https://www.dbvis.com/

DeepDive – Analyze Data On a Deeper Level Than Ever Before
https://deepdive.stanford.edu/

Deep Web Research and Discovery Resources 2022
https://DeepWeb.us/

Digital Operating Systems Tools and Resources 2022
https://www.DigitalOperatingSystems.com/

Data Warehouse, Data Mart, Data Mining and Decision Support Resources
https://www.infogoal.com/dmc/dmcdwh.htm

DiscoverText – Capture Text Data and Crunch Your Data
https://discovertext.com/

Distributed Data Mining in Credit Card Fraud Detection
https://cs.fit.edu/~pkc/papers/ieee-is99.pdf

Easy Data Mining Software
https://www.tableau.com/

Easy PDF Cloud
https://www.easypdfcloud.com/

eBiquity Research Group Blogger
https://ebiquity.umbc.edu/blogger/

Early Canadiana Online
https://www.canadiana.ca/

Elastic Web Mining Talk
https://www.slideshare.net/kkrugler/elastic-web-mining-2407818

EU Open Data Portal
https://data.europa.eu/euodp/en/home

Everything You Wanted to Know About Data Mining but Were Afraid to Ask by Alexander Furnas
https://www.theatlantic.com/technology/archive/2012/04/everything-you-wanted-to-know-about-data-mining-but-were-afraid-to-ask/255388/

GeneMiner
https://www.biomedcentral.com/1471-2105/8/S8/P3

Google BigQuery – Query Cloud Based Datasets
https://cloud.google.com/bigquery/

Google Open Refine 2.0 – Open Source Power Tool for Data Wranglers Working With Messy Data
https://github.com/OpenRefine

Great War Primary Documents Archive
https://www.gwpda.org/

GROBID
https://github.com/kermitt2/grobid

Healthdata.gov
https://www.healthdata.gov/

History of Data Mining by Raymond Li
https://rayli.net/blog/data/history-of-data-mining/

Imagination Engines
https://www.Imagination-Engines.com/

Indiegogo Datasets
https://webrobots.io/indiegogo-dataset/

Information Retrieval (IR) and Information Extraction (IE) on the Web Using Hypertext Meta-Data and Structure
http://www.webir.org/

International Journal of Business Intelligence and Data Mining (IJBIDM)
http://www.inderscience.com/jhome.php?jcode=ijbidm

International Journal of Data Mining and Bioinformatics (IJDMB)
http://www.inderscience.com/jhome.php?jcode=ijdmb

International Journal of Data Warehousing and Mining (IJDWM)
http://www.igi-global.com/journal/international-journal-data-warehousing-mining/1085

Internet Archive
http://www.archive.org/

Inter-university Consortium for Political and Social Research (ICPSR)
http://www.icpsr.umich.edu/

Jaspersoft® ETL – The Open Source Data Integration Platform
https://community.jaspersoft.com/project/jaspersoft-etl

Junar – The Open Data Platform
http://www.junar.com/

Kaggle – Go from Big Data to Big Analytics
https://kaggle.com/

KDD-2008
https://www.kdd2008.com/

KDD-2009
https://www.kdd.org/conferences/kdd-2009-paris-france-june-28-july-1

KDD-2010
https://www.kdd.org/conferences/kdd-2010-washington-dc-july-25-28

KDD-2011
https://www.kdd.org/conferences/kdd-2011-san-diego-ca-august-21-24-2011

KDD-2012
https://www.kdd.org/conferences/kdd-2012-august-12-16-2012-beijing-china

KDD-2014
https://www.kdd.org/kdd2014/

KDD-2015
https://www.kdd.org/kdd2015/

KDD-2016
https://www.kdd.org/kdd2016/

KDD-2017
https://www.kdd.org/kdd2017/

KDD-2018
https://www.kdd.org/kdd2018/

KDD-2019
https://www.kdd.org/kdd2019/

KDD-2020
http://www.kdd.org/kdd2020

KDD-2021
http://www.kdd.org/kdd2021

KDD-2022
https://www.kdd.org/kdd2022/

KDnuggets – A Leading Site on Data Science, Machine Learning, AI and Analytics
https://www.kdnuggets.com/

KEEL (Knowledge Extraction Based on Evolutionary Learning)
https://www.keel.es/

Kickstarter Datasets
https://webrobots.io/kickstarter-datasets/

KNIME – End to End Data Science
https://www.knime.org/

Knowledge Discovery Resources 2022
https://www.KnowledgeDiscovery.info/

Knowledge Discovery Resources 2022 Annotated White Paper Link Compilation by Marcus P. Zillman, M.S., A.M.H.A.
https://www.KDResources.info/

Knowledge Enterprise Semantic Intelligence Suite
https://transinsight.com/

KnowleSys – Web Intelligence Monitoring
https://www.knowlesys.com

LingPipe – Information Extraction and Data Mining Tools
https://www.predictiveanalyticstoday.com/lingpipe/

LoginWorks – Advanced Solutions – Data Mining and Web Scraping
http://www.loginworks.com/

Machine Learning from Scratch
https://github.com/eriklindernoren/ML-From-Scratch

Mallet – MAchine Learning for LanguagE Toolkit
https://mallet.cs.umass.edu/

Marriott Library at the University of Utah Digital Collections
https://www.lib.utah.edu/

Marti Hearst Home Page
https://people.ischool.berkeley.edu/~hearst/

Megaputer – Data Mining and Text Mining Software
https://www.megaputer.com/

Microsoft® Data Mining Project – Efficient Data Exploration and Modeling
http://research.microsoft.com/en-us/projects/datamining/

Minerazzi – Your Search-and-Mine Ecosystem
https://www.minerazzi.com/

Mining Road Traffic Accident Data
http://ai-d.org/pdfs/Beshah.pdf

MIT OpenCourseWare – Data Mining Course (15.062, Spring 2003)
https://ocw.mit.edu/courses/sloan-school-of-management/15-062-data-mining-spring-2003/

MOA (Massive Online Analysis)
https://moa.cms.waikato.ac.nz/

MoData – Big Data Resources
https://www.mo-data.com/

MonetDB – Query Processing at Light Speed
https://www.monetdb.org/

Mozenda – Data Extraction and Comprehensive Web Data Gathering
https://www.mozenda.com/

National Archives, London
https://nationalarchives.gov.uk/

National Centre for Text Mining (NaCTeM)
https://www.nactem.ac.uk/

National Science Digital Library (NSDL)
https://nsdl.oercommons.org/

National Technical Information Service (NTIS)
https://www.ntis.gov/

Neural Networks in Data Mining
https://www.jatit.org/volumes/research-papers/Vol5No1/1Vol5No6.pdf

Nesstar – Publish Data on the Web
https://www.nesstar.com/

NetOwl – Entity Extraction and Entity Analytics for Big Data
https://www.netowl.com/

New York Public Library
https://www.nypl.org/

Nuix – eDiscovery and Electronic Investigation Software
https://www.nuix.com/

Observatory on Social Media (OSoMe)
https://osome.iu.edu/

OntoMiner: Bootstrapping and Populating Ontologies From Domain Specific Web Sites
https://www.public.asu.edu/~hdavulcu/VLDB-WS03.pdf

Open Data Handbook – Guides, Case Studies and Resources for Government and Civil Society On the What, Why and How of Open Data
https://opendatahandbook.org/

Open Data Inception
https://opendatainception.io/

Open Data Institute
https://theodi.org/

Open Data Inventory (ODIN)
https://odin.opendatawatch.com/

Open Data Network
https://www.opendatanetwork.com/

Open Datasets

Open Educational Resources (OER) Sources 2022
http://www.OERSources.com/

OpenMinted – Open Service Oriented e-Infrastructure for Scientific and Scholarly Text and Data Mining
http://openminted.eu/

Open/Public Data Sources
http://www.scaleunlimited.com/datasets/public-datasets/

Open Source Data Mining Tools
https://www.scaleunlimited.com/oss/open-source-data-mining-tools/

Oracle Data Mining
https://www.oracle.com/technetwork/database/options/advanced-analytics/odm/overview/index.html

Orange – Open Source Data Visualization and Analysis for Novices and Experts
https://orange.biolab.si/

Overview – Open Source Document Mining
https://blog.overviewdocs.com/

PC AI Magazine Artificial Intelligence
https://www.pcai.com/

PEPITe S.A. – Unlock Your Knowledge
https://www.pepite.be/

Prediction Markets 2022
https://www.PredictionMarkets.com/

Predictive Model Markup Language (PMML) – SourceForge.net: Project Info
https://sourceforge.net/projects/pmml

Predictive Model Markup Language (PMML)
https://xml.coverpages.org/pmml.html

Probabilistic Data Models for Web Analytics and Data Mining
https://highlyscalable.wordpress.com/2012/05/01/probabilistic-structures-web-analytics-data-mining/

Proxycrawl – Stay Anonymous While Crawling the Web
https://proxycrawl.com/

QDA Miner Lite (Freeware)
https://provalisresearch.com/products/qualitative-data-analysis-software/freeware/

QL2 Software – Unstructured Data Management and Web Mining Software
http://www.ql2.com/

QueryTree – Explore Data Without Code
https://querytreeapp.com/

Raghu Ramakrishnan Home Page
http://pages.cs.wisc.edu/~raghu/

RapidMiner – Open Source Data Mining Tool
https://rapid-i.com/content/blogcategory/10/69/

Rattle – Data Mining Toolkit in R
https://code.google.com/p/rattle/

re3data.org – 2,000 Data Repositories
https://www.re3data.org/

Recommended Books on Data Mining
https://www.albionresearch.com/books/data_mining.php

Rexer Analytics – Analytic and CRM Consulting
https://www.rexeranalytics.com/

Ron Kohavi Home Page
https://robotics.stanford.edu/~ronnyk/

SAS – Data and Text Mining
https://www.sas.com/technologies/analytics/datamining/index.html

SAS – What Is Data Mining?
https://www.sas.com/en_us/insights/analytics/data-mining.html

Scientific Data Repository – Real Time Visualization and Exploration Techniques
https://www.mlvis.com/platform.php

Screen-Scraper – Data Extraction Software and Services
https://www.screen-scraper.com/

Searching the Internet 2022
https://www.SearchingTheInternet.info/

Semantic Scholar – Free Scientific Literature Search and Discovery
https://www.semanticscholar.org/

SIGKDD – ACM Special Interest Group – Knowledge Discovery in Data and Data Mining
https://en.wikipedia.org/wiki/SIGKDD
https://www.kdd.org/

Slideshare Presentations About Data Mining – a List
https://www.kdnuggets.com/2014/11/most-popular-slideshare-presentations-data-mining.html

Slideshare Presentation About Data Mining
https://www.slideshare.net/smj/data-mining-slides

Smithsonian/NASA Astrophysics Data System (ADS)
https://adsabs.harvard.edu/index.html

Social Buzz Bot 2022 – Business Intelligence Data Mining for Information Discovery from Social Communities [PDF file download]
https://www.SocialBuzzBot.com/

Software Suites for Data Mining, Analytics, and Knowledge Discovery
https://www.kdnuggets.com/software/suites.html

Special Interest Group – Knowledge Discovery in Data and Data Mining – SIGKDD Explorations Newsletter
https://www.kdd.org/explorations/

SPMF – Open Source Data Mining Library
https://www.philippe-fournier-viger.com/spmf/

Stanford Data Mining Course (CS345A) – Course Handouts
https://web.stanford.edu/class/cs345a/handouts.html

Statistical Analysis and Data Mining
https://onlinelibrary.wiley.com/journal/10.1002/%28ISSN%291932-1872

Statistics Resources and Big Data 2022
https://www.StatisticsResources.info/

Statoo Statistical Consulting + Data Analysis + Data Mining
https://www.statoo.com/en/

Streaming Data Mining
https://www.cs.yale.edu/homes/el327/papers/streaming_data_mining.pdf

Talend Open Data Solutions
https://www.talend.com/

Tanagra Project – Free Data Mining Software for Academic and Research Purposes
https://eric.univ-lyon2.fr/~ricco/tanagra/en/tanagra.html

Text Mining
https://www.istl.org/17-spring/internet.html

Text Mining for Scholarly Communications and Repositories
https://www.nactem.ac.uk/tm-ukoln.php

The Archaeology Data Service (ADS)
https://archaeologydataservice.ac.uk/

The Centre for Contemporary Canadian Art – Canadian Art Database Project
https://ccca.concordia.ca/

The Data Mine
https://www.the-data-mine.com/

The Hackathon Guide for Aspiring Data Scientists
https://www.kdnuggets.com/2019/07/hackathon-guide-aspiring-data-scientists.html

The History Data Service (HDS)
https://hds.essex.ac.uk/

The National Centre for Text Mining: Aims and Objectives by Sophia Ananiadou, Julia Chruszcz, John Keane, John McNaught and Paul Watry
https://www.ariadne.ac.uk/issue42/ananiadou/

The New York Times Article Search API
https://developer.nytimes.com/

The Open Access Digital Library
https://grweb.coalliance.org/oadl/oadl.html

The Ultimate Artificial Intelligence Resources Guide by Kyle Poyar
https://labs.openviewpartners.com/artificial-intelligence-resources-guide/

Togaware – Data Mining Resources
https://datamining.togaware.com/

T-Rex (Trainable Relation Extraction)
https://sourceforge.net/projects/t-rex/

Try Data Mining Queries Interactively Online Using a Sample Dataset
https://overpass-turbo.eu/

UC Irvine Machine Learning Repository
https://archive.ics.uci.edu/ml/index.php

Udemy Course About Data Mining
https://www.udemy.com/data-mining/

University of Florida Digital Collections (UFDC)
https://ufdc.ufl.edu/

University of North Texas Digital Collections
https://digital.library.unt.edu/explore/collections/

Using the Internet As a Dynamic Resource Tool for Knowledge Discovery 2021
https://www.zillman.us/white-papers/using-the-internet-as-a-dynamic-resource-tool-for-knowledge-discovery/

VentureSource – Global Database on Companies Backed by Venture Capital and Private Equity
https://www.dowjones.com/products/pevc/

Wallmine – Wall Street Data Mining
https://wallmine.com/

Web Data Extractors 2022
https://www.WebDataExtractors.com/

Web-Harvest – Open Source Web Data Extraction Tool written in Java
https://web-harvest.sourceforge.net/

Web Harvesting by Russell Kay
https://www.computerworld.com/s/article/93919/Web_Harvesting?taxonomyId=062

Webz.io – Turn Unstructured Web Content Into Machine-Readable Data Feeds That You Can Consume On Demand
https://webz.io/

Weka 3 – Data Mining with Open Source Machine Learning Software in Java
https://www.cs.waikato.ac.nz/~ml/weka/index.html

What is Data Mining? – IBM
https://www.ibm.com/cloud/learn/data-mining

White Papers 2022 by Marcus P. Zillman, M.S., A.M.H.A.
https://www.WhitePapers.us/

WizSoft – Data and Text Mining
https://www.wizsoft.com/

World Bank Datasets For Data Mining
https://data.worldbank.org/data-catalog/research-datasets-analytical-tools

YouTube Analytics and Data Mining
https://www.nextanalytics.com/wp-content/uploads/2015/01/how-to-data-mine-analyze-youtube-excel-macro-addin.pdf

Zentut – What Is Data Mining? Tutorial
https://www.zentut.com/data-mining/what-is-data-mining/
