creating corpora
About "creating corpora"
A corpus is a structured collection of texts used for linguistic research, natural language processing (NLP), and other language-related tasks.

Purpose and planning
- Clearly defining the corpus's purpose determines the scope and type of data. For example, general corpora (like the British National Corpus) include various genres and topics, while specialized corpora focus on specific domains or time periods.
- Planning includes deciding on size, balance (equal representation of genres or periods), and language varieties (dialects, formal/informal).

Data collection methods
- Manual collection: gathering texts personally or from libraries, ensuring quality and relevance.
- Web scraping: using automated tools to collect web texts; needs filtering and ethical considerations.
- ...
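The filtering and balance points above can be illustrated with a minimal Python sketch. It assumes texts have already been collected (fetching pages is left out on purpose) and that each one is a dict with hypothetical `genre` and `text` keys. The `min_words` threshold and the word-count definition of balance are illustrative choices, not something the slides prescribe.

```python
import re
from collections import Counter


def normalize(text: str) -> str:
    """Collapse runs of whitespace so near-identical copies compare equal."""
    return re.sub(r"\s+", " ", text).strip()


def filter_texts(docs, min_words=5):
    """Drop very short texts and exact duplicates (after normalization).

    `min_words` is an assumed threshold chosen for illustration only.
    """
    seen = set()
    kept = []
    for doc in docs:
        body = normalize(doc["text"])
        if len(body.split()) < min_words:
            continue  # too short to be useful, e.g. navigation residue
        key = body.lower()
        if key in seen:
            continue  # duplicate of a text that was already kept
        seen.add(key)
        kept.append({"genre": doc["genre"], "text": body})
    return kept


def genre_balance(docs):
    """Return each genre's share of the total whitespace-separated word count."""
    words = Counter()
    for doc in docs:
        words[doc["genre"]] += len(doc["text"].split())
    total = sum(words.values())
    return {genre: n / total for genre, n in words.items()} if total else {}


# Hypothetical sample data standing in for collected texts.
sample = [
    {"genre": "news", "text": "The council approved the new budget on Monday."},
    {"genre": "news", "text": "The council  approved the new budget on Monday."},
    {"genre": "fiction", "text": "She walked slowly along the empty shore at dusk."},
    {"genre": "web", "text": "Click here"},
]

cleaned = filter_texts(sample)
shares = genre_balance(cleaned)
for genre, share in sorted(shares.items()):
    print(f"{genre}: {share:.0%}")
```

Comparing the reported shares against the targets set during planning shows which genres are under-represented and need more collection. Real projects would likely add near-duplicate detection and language identification on top of this.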
This file contains 6 pages in PPTX format (709.4 KB).