URL: http://code.google.com/p/lapdftext/
Proper Citation: lapdftext (RRID:SCR_006167)
Description: Software that facilitates accurate extraction of text from PDF files of research articles for use in text mining applications. It is intended for both scientists and natural language processing (NLP) engineers interested in getting access to text within specific sections of research articles. The system extracts text blocks from PDF-formatted full-text research articles and classifies them into logical units based on rules that characterize specific sections. The LA-PDFText system focuses only on the textual content of the research articles. The current version of LA-PDFText is a baseline system that extracts text using a three-stage process:
* identification of blocks of contiguous text
* classification of these blocks into rhetorical categories
* extraction of the text from blocks grouped section-wise
Abbreviations: lapdftext, LA-PDFText
Synonyms: Layout-Aware PDF Text Extraction, Layout-Aware Text Extraction from Full-text PDF of Scientific Articles, lapdftext: Layout-Aware Text Extraction from Full-text PDF of Scientific Articles
Resource Type: software resource, software application, text extraction software
Defining Citation: PMID:22640904
Keywords: text mining, pdf, text extraction, natural language processing
No rating or validation information has been found for lapdftext.
No alerts have been found for lapdftext.
Source: SciCrunch Registry