Class information for:
Level 1: INFORMATION EXTRACTION//WEB DATA EXTRACTION//WRAPPER INDUCTION

Basic class information

Class id | #P | Avg. number of references | Database coverage of references
17660 | 588 | 27.0 | 24%



Bar chart of publication year (figure not reproduced). Data for the most recent years may be incomplete.

Hierarchy of classes

The table includes all classes above and classes immediately below the current class.



Cluster id | Level | Cluster label | #P
9 | 4 | COMPUTER SCIENCE, THEORY & METHODS//COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE//COMPUTER SCIENCE, INFORMATION SYSTEMS | 1247339
20 | 3 | COMPUTER SCIENCE, INFORMATION SYSTEMS//COMPUTER SCIENCE, THEORY & METHODS//COMPUTER SCIENCE, SOFTWARE ENGINEERING | 118625
164 | 2 | RECOMMENDER SYSTEMS//COLLABORATIVE FILTERING//COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | 22632
17660 | 1 | INFORMATION EXTRACTION//WEB DATA EXTRACTION//WRAPPER INDUCTION | 588

Terms with highest relevance score



Rank | Term | Term type | Chi square | Share of publ. in class containing term | Class's share of term's total occurrences | #P with term in class
1 | INFORMATION EXTRACTION | authKW | 824187 | 17% | 16% | 98
2 | WEB DATA EXTRACTION | authKW | 614950 | 3% | 79% | 15
3 | WRAPPER INDUCTION | authKW | 585071 | 2% | 87% | 13
4 | WRAPPER GENERATION | authKW | 432744 | 2% | 83% | 10
5 | DEEP WEB | authKW | 379811 | 3% | 46% | 16
6 | WEB INFORMATION EXTRACTION | authKW | 302132 | 1% | 73% | 8
7 | HIDDEN WEB | authKW | 221559 | 1% | 53% | 8
8 | ATTRIBUTE EXTRACTION | authKW | 155789 | 1% | 100% | 3
9 | AUTOMATIC WRAPPER GENERATION | authKW | 155789 | 1% | 100% | 3
10 | DATA MINING WEB BASED INFORMATION | authKW | 155789 | 1% | 100% | 3
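
The column giving the share of the class's publications that contain a term appears to be the count of publications with the term divided by the class size (#P = 588). The following is a minimal sketch of that arithmetic, not the page's actual implementation: the function name `share_of_class` is hypothetical, and rounding to the nearest whole percent is an assumption that happens to match the displayed values.

```python
# Class size (#P) as listed under "Basic class information".
CLASS_SIZE = 588


def share_of_class(n_with_term: int, class_size: int = CLASS_SIZE) -> int:
    """Return the percentage of the class's publications containing a term.

    Assumption: the page rounds to the nearest whole percent.
    """
    return round(100 * n_with_term / class_size)


# (term, #P with term in class) pairs copied from the table above.
rows = [
    ("INFORMATION EXTRACTION", 98),
    ("WEB DATA EXTRACTION", 15),
    ("WRAPPER INDUCTION", 13),
]

for term, n_with_term in rows:
    print(f"{term}: {share_of_class(n_with_term)}%")
```

The other share column (the class's share of a term's total occurrences) cannot be checked the same way, because the total occurrence counts across all classes are not listed on this page.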

Web of Science journal categories



Rank | Term | Chi square | Share of publ. in class containing term | Class's share of term's total occurrences | #P with term in class
1 | Computer Science, Information Systems | 23062 | 51% | 0% | 301
2 | Computer Science, Artificial Intelligence | 17735 | 45% | 0% | 266
3 | Computer Science, Software Engineering | 6228 | 24% | 0% | 141
4 | Computer Science, Theory & Methods | 4810 | 25% | 0% | 148
5 | Computer Science, Hardware & Architecture | 677 | 6% | 0% | 37
6 | Information Science & Library Science | 321 | 4% | 0% | 23
7 | Engineering, Electrical & Electronic | 310 | 16% | 0% | 92
8 | Computer Science, Interdisciplinary Applications | 151 | 5% | 0% | 29
9 | Operations Research & Management Science | 77 | 3% | 0% | 18
10 | Telecommunications | 42 | 3% | 0% | 17

Address terms



Rank | Term | Chi square | Share of publ. in class containing term | Class's share of term's total occurrences | #P with term in class
1 | COMP SCI TECHNOL PROGRAM | 93471 | 1% | 60% | 3
2 | INFORMAT COMMUN COMP TECHNOL | 69238 | 0% | 67% | 2
3 | DBAI | 58417 | 1% | 38% | 3
4 | AD T INTELLIGENT INTERNET AGENT | 51930 | 0% | 100% | 1
5 | CELLULAR AUTOMATA KNOWLEDGE ENGN CAKE | 51930 | 0% | 100% | 1
6 | CITESEER PROJECT | 51930 | 0% | 100% | 1
7 | CITIZEN OBSERV | 51930 | 0% | 100% | 1
8 | CLUSTER EXCELLENCE ASIA EUROPE | 51930 | 0% | 100% | 1
9 | COMP EGNN | 51930 | 0% | 100% | 1
10 | COMP INTELLIGENT | 51930 | 0% | 100% | 1

Journals



Rank | Term | Chi square | Share of publ. in class containing term | Class's share of term's total occurrences | #P with term in class
1 | SIGMOD RECORD | 36988 | 3% | 4% | 20
2 | WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS | 35363 | 3% | 4% | 17
3 | ACM TRANSACTIONS ON THE WEB | 27169 | 2% | 5% | 10
4 | DATA & KNOWLEDGE ENGINEERING | 20424 | 4% | 2% | 22
5 | IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING | 16568 | 5% | 1% | 31
6 | KNOWLEDGE AND INFORMATION SYSTEMS | 7078 | 2% | 1% | 12
7 | JOURNAL OF INTELLIGENT INFORMATION SYSTEMS | 7016 | 2% | 2% | 9
8 | JOURNAL OF UNIVERSAL COMPUTER SCIENCE | 6775 | 3% | 1% | 15
9 | LECTURE NOTES IN COMPUTER SCIENCE | 6381 | 18% | 0% | 105
10 | IEEE INTELLIGENT SYSTEMS | 5940 | 2% | 1% | 9

Author Key Words



Rank | Term | Chi square | Share of publ. in class containing term | Class's share of term's total occurrences | #P with term in class
1 | INFORMATION EXTRACTION | 824187 | 17% | 16% | 98
2 | WEB DATA EXTRACTION | 614950 | 3% | 79% | 15
3 | WRAPPER INDUCTION | 585071 | 2% | 87% | 13
4 | WRAPPER GENERATION | 432744 | 2% | 83% | 10
5 | DEEP WEB | 379811 | 3% | 46% | 16
6 | WEB INFORMATION EXTRACTION | 302132 | 1% | 73% | 8
7 | HIDDEN WEB | 221559 | 1% | 53% | 8
8 | ATTRIBUTE EXTRACTION | 155789 | 1% | 100% | 3
9 | AUTOMATIC WRAPPER GENERATION | 155789 | 1% | 100% | 3
10 | DATA MINING WEB BASED INFORMATION | 155789 | 1% | 100% | 3

Core articles

The table lists the core articles in the class. The following variables are taken into account in the relevance score of an article in a cluster c:
(1) Number of references referring to publications in the class.
(2) Share of the total number of active references referring to publications in the class.
(3) Age of the article. Newer articles get a higher score than older articles.
(4) Citation rate, normalized by year.



Rank | Reference | # ref. in cl. | Shr. of ref. in cl. | Citations
1 | SLEIMAN, HA, CORCHUELO, R (2013) A SURVEY ON REGION EXTRACTORS FROM WEB DOCUMENTS. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING. VOL. 25. ISSUE 9. P. 1960-1981 | 34 | 85% | 12
2 | SLEIMAN, HA, CORCHUELO, R (2013) TEX: AN EFFICIENT AND EFFECTIVE UNSUPERVISED WEB INFORMATION EXTRACTOR. KNOWLEDGE-BASED SYSTEMS. VOL. 39. P. 109-123 | 25 | 96% | 10
3 | SLEIMAN, HA, CORCHUELO, R (2014) TRINITY: ON USING TRINARY TREES FOR UNSUPERVISED WEB DATA EXTRACTION. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING. VOL. 26. ISSUE 6. P. 1544-1556 | 20 | 100% | 5
4 | JIMENEZ, P, CORCHUELO, R, SLEIMAN, HA (2016) ARIEX: AUTOMATED RANKING OF INFORMATION EXTRACTORS. KNOWLEDGE-BASED SYSTEMS. VOL. 93. P. 84-108 | 25 | 74% | 0
5 | ROUDAKI, A, KONG, J, ZHANG, K (2016) SPECIFICATION AND DISCOVERY OF WEB PATTERNS: A GRAPH GRAMMAR APPROACH. INFORMATION SCIENCES. VOL. 328. P. 528-545 | 16 | 100% | 0
6 | SHI, SS, LIU, CF, SHEN, Y, YUAN, CF, HUANG, YH (2015) AUTORM: AN EFFECTIVE APPROACH FOR AUTOMATIC WEB DATA RECORD MINING. KNOWLEDGE-BASED SYSTEMS. VOL. 89. P. 314-331 | 16 | 100% | 0
7 | JIMENEZ, P, CORCHUELO, R (2016) ON LEARNING WEB INFORMATION EXTRACTION RULES WITH TANGO. INFORMATION SYSTEMS. VOL. 62. P. 74-103 | 15 | 94% | 0
8 | SLEIMAN, HA, CORCHUELO, R (2014) A CLASS OF NEURAL-NETWORK-BASED TRANSDUCERS FOR WEB INFORMATION EXTRACTION. NEUROCOMPUTING. VOL. 135. P. 61-68 | 14 | 100% | 1
9 | VARLAMOV, MI, TURDAKOV, DY (2016) A SURVEY OF METHODS FOR THE EXTRACTION OF INFORMATION FROM WEB RESOURCES. PROGRAMMING AND COMPUTER SOFTWARE. VOL. 42. ISSUE 5. P. 279-291 | 14 | 93% | 0
10 | FAZZINGA, B, FLESCA, S, TAGARELLI, A (2011) SCHEMA-BASED WEB WRAPPING. KNOWLEDGE AND INFORMATION SYSTEMS. VOL. 26. ISSUE 1. P. 127-173 | 17 | 81% | 3

Classes with closest relation at Level 1



Rank | Class id | Class label
1 | 14541 | PAGERANK//GOOGLE MATRIX//LINK ANALYSIS
2 | 26870 | LEARNING AGENTS//KNOWLEDGE ACQUISIT SHARING GRP//DOMAIN SPECIFIC ONTOLOGY
3 | 9301 | WORDNET//WORD SENSE DISAMBIGUATION//SEMANTIC SIMILARITY
4 | 18217 | LINKED DATA//JOURNAL OF WEB SEMANTICS//SPARQL
5 | 13552 | XML//XPATH//KEYWORD SEARCH
6 | 27375 | CONTENT ADAPTATION//AMBIENT NETWORKING//DYNAMIC CONTENT ADAPTATION
7 | 28184 | SEMANTIC WEB SEARCH//NATURAL LANGUAGE INTERFACES TO DATABASES//SEMANTIC SEARCH ON THE WEB
8 | 11493 | TEXT CATEGORIZATION//TEXT CLASSIFICATION//SPAM FILTERING
9 | 21920 | WEB USAGE MINING//WEB LOG MINING//WEB ROBOT DETECTION
10 | 37715 | SCI CORP INFORMAT SYST//3RD AGE//A NOISY TIME SERIES
