Textual data science with R (Record no. 822)
000 - LEADER | |
---|---|
fixed length control field | 08586nam a22002417a 4500 |
005 - DATE AND TIME OF LATEST TRANSACTION | |
control field | 20210129144305.0 |
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION | |
fixed length control field | 210129b ||||| |||| 00| 0 eng d |
020 ## - INTERNATIONAL STANDARD BOOK NUMBER | |
International Standard Book Number | 9781138626911 |
082 ## - DEWEY DECIMAL CLASSIFICATION NUMBER | |
Classification number | 401.410285555 |
Item number | BEC |
100 ## - MAIN ENTRY--PERSONAL NAME | |
Personal name | Becue-Bertaut, Monica |
245 ## - TITLE STATEMENT | |
Title | Textual data science with R |
260 ## - PUBLICATION, DISTRIBUTION, ETC. (IMPRINT) | |
Name of publisher, distributor, etc. | CRC Press |
Place of publication, distribution, etc. | Boca Raton |
Date of publication, distribution, etc. | 2018 |
300 ## - PHYSICAL DESCRIPTION | |
Extent | xvii, 194 p. |
365 ## - TRADE PRICE | |
Price type code | GBP |
Price amount | 59.99 |
490 ## - SERIES STATEMENT | |
Series statement | Chapman & Hall/CRC Computer Science & Data Analysis |
504 ## - BIBLIOGRAPHY, ETC. NOTE | |
Bibliography, etc. note | Table of Contents<br/>1. Encoding: from a corpus to statistical tables<br/>Textual and contextual data<br/>Textual data<br/>Contextual data<br/>Documents and aggregate documents<br/>Examples and notation<br/>Choosing textual units<br/>Graphical forms<br/>Lemmas<br/>Stems<br/>Repeated segments<br/>In practice<br/>Preprocessing<br/>Unique spellings<br/>Partially-automated preprocessing<br/>Word selection<br/>Word and segment indexes<br/>The Life UK corpus: preliminary results<br/>Verbal content through word and repeated segment indexes<br/>Univariate description of contextual variables<br/>A note on the frequency range<br/>Implementation with the Xplortext package<br/>In summary<br/>2. Correspondence analysis of textual data<br/>Data and goals<br/>Correspondence analysis: a tool for linguistic data analysis<br/>Data: a small example<br/>Objectives<br/>Associations between documents and words<br/>Profile comparisons<br/>Independence of documents and words<br/>The χ² test<br/>Association rates between columns and words<br/>Active row and column clouds<br/>Row and column profile spaces<br/>Distributional equivalence and the χ² distance<br/>Inertia of a cloud<br/>Fitting document and word clouds<br/>Factorial axes<br/>Visualizing rows and columns<br/>Category representation<br/>Word representation<br/>Transition formulas<br/>Superimposed representation of rows and columns<br/>Interpretation aids<br/>Eigenvalues and representation quality of the clouds<br/>Contribution of documents and words to axis inertia<br/>Representation quality of a point<br/>Supplementary rows and columns<br/>Supplementary tables<br/>Supplementary frequency rows and columns<br/>Supplementary quantitative and qualitative variables<br/>Validating the visualization<br/>Interpretation scheme for textual CA results<br/>Implementation with Xplortext<br/>Summary of the CA approach<br/>3. Applications of correspondence analysis<br/>Choosing the level of detail for analyses<br/>Correspondence analysis on aggregate free text answers<br/>Data and objectives<br/>Word selection<br/>CA on the aggregate table<br/>Document representation<br/>Word representation<br/>Simultaneous interpretation of the plots<br/>Supplementary elements<br/>Supplementary words<br/>Supplementary repeated segments<br/>Supplementary categories<br/>Implementation with Xplortext<br/>Direct analysis<br/>Data and objectives<br/>The main features of direct analysis<br/>Direct analysis of the culture question<br/>Implementation with Xplortext<br/>4. Clustering in textual analysis<br/>Clustering documents<br/>Dissimilarity measures between documents<br/>Measuring partition quality<br/>Document clusters in the factorial space<br/>Partition quality<br/>Dissimilarity measures between document clusters<br/>The single-linkage method<br/>The complete-linkage method<br/>Ward's method<br/>Agglomerative hierarchical clustering<br/>Hierarchical tree construction algorithm<br/>Selecting the final partition<br/>Interpreting clusters<br/>Direct partitioning<br/>Combining clustering methods<br/>Consolidating partitions<br/>Direct partitioning followed by AHC<br/>A procedure for combining CA and clustering<br/>Example: joint use of CA and AHC<br/>Data and objectives<br/>Data preprocessing using CA<br/>Constructing the hierarchical tree<br/>Choosing the final partition<br/>Contiguity-constrained hierarchical clustering<br/>Principles and algorithm<br/>AHC of age groups with a chronological constraint<br/>Implementation with Xplortext<br/>Example: clustering free text answers<br/>Data and objectives<br/>Data preprocessing<br/>CA: eigenvalues and total inertia<br/>Interpreting the first axes<br/>AHC: building the tree and choosing the final partition<br/>Describing cluster features<br/>Lexical features of clusters<br/>Describing clusters in terms of characteristic words<br/>Describing clusters in terms of characteristic documents<br/>Describing clusters using contextual variables<br/>Describing clusters using contextual qualitative variables<br/>Describing clusters using quantitative contextual variables<br/>Implementation with Xplortext<br/>Summary of the use of AHC on factorial coordinates coming from CA<br/>5. Lexical characterization of parts of a corpus<br/>Characteristic words<br/>Characteristic words and CA<br/>Characteristic words and clustering<br/>Clustering based on verbal content<br/>Clustering based on contextual variables<br/>Hierarchical words<br/>Characteristic documents<br/>Example: characteristic elements and CA<br/>Characteristic words for the categories<br/>Characteristic words and factorial planes<br/>Documents that characterize categories<br/>Characteristic words in addition to clustering<br/>Implementation with Xplortext<br/>6. Multiple factor analysis for textual analysis<br/>Multiple tables in textual analysis<br/>Data and objectives<br/>Data preprocessing<br/>Problems posed by lemmatization<br/>Description of the corpora data<br/>Indexes of the most frequent words<br/>Notation<br/>Objectives<br/>Introduction to MFACT<br/>The limits of CA on multiple contingency tables<br/>How MFACT works<br/>Integrating contextual variables<br/>Analysis of multilingual free text answers<br/>MFACT: eigenvalues of the global analysis<br/>Representation of documents and words<br/>Superimposed representation of the global and partial configurations<br/>Links between the axes of the global analysis and the separate analyses<br/>Representation of the groups of words<br/>Implementation with Xplortext<br/>Simultaneous analysis of two open-ended questions: impact of lemmatization<br/>Objectives<br/>Preliminary steps<br/>MFACT on the left and right: lemmatized or nonlemmatized<br/>Implementation with Xplortext<br/>Other applications of MFACT in textual analysis<br/>MFACT summary<br/>7. Applications and analysis workflows<br/>General rules for presenting results<br/>Analyzing bibliographic databases<br/>Introduction to the lupus data<br/>The corpus<br/>Exploratory analysis of the corpus<br/>CA of the documents × words table<br/>The eigenvalues<br/>Meta-keys and doc-keys<br/>Analysis of the year-aggregate table<br/>Eigenvalues and CA of the lexical table<br/>Chronological study of drug names<br/>Implementation with Xplortext<br/>Conclusions from the study<br/>Badinter's speech: a discursive strategy<br/>Methods<br/>Breaking up the corpus into documents<br/>The speech trajectory unveiled by CA<br/>Results<br/>Argument flow<br/>Conclusions on the study of Badinter's speech<br/>Implementation with Xplortext<br/>Political speeches<br/>Data and objectives<br/>Methodology<br/>Results<br/>Data preprocessing<br/>Lexicometric characteristics of the speeches and lexical table coding<br/>Eigenvalues and Cramér's V<br/>Speech trajectory<br/>Word representation<br/>Remarks<br/>Hierarchical structure of the corpus<br/>Conclusions<br/>Implementation with Xplortext<br/>Corpus of sensory descriptions<br/>Introduction<br/>Data<br/>Eight Catalan wines<br/>Jury<br/>Verbal categorization<br/>Encoding the data<br/>Objectives<br/>Statistical methodology<br/>MFACT and constructing the mean configuration<br/>Determining consensual words<br/>Results<br/>Data preprocessing<br/>Some initial results<br/>Individual configurations<br/>MFACT: directions of inertia common to the majority of groups<br/>MFACT: representing words and documents on the first plane<br/>Word contributions<br/>MFACT: group representation<br/>Consensual words<br/>Conclusion |
520 ## - SUMMARY, ETC. | |
Summary, etc. | Textual Data Science with R comprehensively covers the main multidimensional methods in textual statistics, supported by a specially written package in R. Methods discussed include correspondence analysis, clustering, and multiple factor analysis for contingency tables. Each method is illuminated by applications. The book is aimed at researchers and students in statistics, social sciences, history, literature, and linguistics. It will be of interest to anyone from practitioners needing to extract information from texts to students in the field of massive data, where the ability to process textual data is becoming essential. |
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM | |
Topical term or geographic name as entry element | Computational linguistics |
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM | |
Topical term or geographic name as entry element | Discourse analysis--Statistical methods |
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM | |
Topical term or geographic name as entry element | R (Computer program language) |
942 ## - ADDED ENTRY ELEMENTS (KOHA) | |
Source of classification or shelving scheme | Dewey Decimal Classification |
Koha item type | Book |
Withdrawn status | Lost status | Source of classification or shelving scheme | Damaged status | Not for loan | Collection code | Bill No | Bill Date | Home library | Current library | Shelving location | Date acquired | Source of acquisition | Cost, normal purchase price | Total Checkouts | Total Renewals | Full call number | Accession Number | Date last seen | Date checked out | Copy number | Cost, replacement price | Price effective from | Koha item type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | Dewey Decimal Classification | | | IT & Decisions Sciences | IN29488 | 28-01-2021 | Indian Institute of Management LRC | Indian Institute of Management LRC | General Stacks | 01/29/2021 | Overseas Press India Private | 4103.74 | 4 | 2 | 401.410285555 BEC | 001092 | 04/15/2024 | 12/19/2023 | 1 | 6124.98 | 01/29/2021 | Book |