Lesson 12: Textsearch and Autocoding

ATLAS/ti provides two similarly designed features that partially automate coding-operations and the search for text-fragments.

It is assumed that you already know how to define a string to look for within the text when working with these functions. There must always be a formal, unambiguous indicator to locate. However, you do not have to commit yourself (for example when "autocoding") to the claim that a certain concept is expressed only through certain formal indicators. Searching for such a string is a heuristic aid that helps you to find specific text-fragments without having to locate each of them by hand. This technique will not free you from reading the text, but it speeds up the task of locating and coding some of the found positions.

Search-features

The simpler of the two is the plain search-feature for strings. It is accessed through the second item from the top in the vertical button bar of the primary text-window. Clicking the icon displays a prompt asking you to enter a string. Decide carefully whether to enter exactly one word, multiple words, or a portion of a word (for example the invariable word-stem, to look up all variations of the word at the same time: "cogni" finds "recognize", "cognition", "cognitive" and so on). If the stem is too short, you are more likely to receive irrelevant results ("enster" in German finds all kinds of windows (Fenster) but also "Gespenster" (ghosts)). If, on the other hand, the string is too long (especially a string of multiple words), you are unlikely to find every occurrence, because a line-break may fall between the words.

The search always begins at the cursor position, so always position the cursor at the beginning of a text before searching it. Unlike the search-feature of WinWord, there is no option to continue from the end of a text back to its beginning. By default ATLAS/ti searches the current text. When a search is initiated, you are asked whether to additionally search all primary-texts.

Once the search-function has located a text-segment containing the search-item, you can switch to the PD-window, mark the desired quotation-borders, code the passage, and then return to the search-function to look up the next passage matching the search-item. If you do not need the search-window for looking up further matches, you may alternatively use the "Find again" function from the context-menu.

You may also decide whether to distinguish between uppercase and lowercase characters, as in the search-feature of WinWord or other word processing applications. Checking the "case sensitive" checkbox enables this.
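
The behavior described in this section (searching from the cursor position, word-stems, the line-break problem and the case-sensitive option) can be imitated outside ATLAS/ti. The following is a minimal Python sketch, not the way ATLAS/ti itself works internally; the function names find_next and find_all and the sample texts are assumptions made for illustration.

```python
def find_next(text, needle, cursor=0, case_sensitive=False):
    """Return the index of the next hit at or after `cursor`, or -1 if none."""
    if not case_sensitive:
        text, needle = text.lower(), needle.lower()
    return text.find(needle, cursor)


def find_all(text, needle, cursor=0, case_sensitive=False):
    """Repeat the search (like "Find again") and collect every hit position."""
    hits = []
    pos = find_next(text, needle, cursor, case_sensitive)
    while pos != -1:
        hits.append(pos)
        pos = find_next(text, needle, pos + 1, case_sensitive)
    return hits


# Made-up sample texts; "\n" marks a line-break inside the text.
english = "Cognition matters. We recognize cognitive\nstyles in the division of\nlabor."
german = "Das Fenster klemmt. Gespenster gibt es nicht."

print(find_all(english, "cogni"))                       # stem finds several word forms
print(find_all(english, "cogni", case_sensitive=True))  # "Cognition" is no longer found
print(find_all(english, "cogni", cursor=25))            # hits before the cursor are missed
print(find_all(english, "division of labor"))           # line-break hides the phrase
print(find_all(german, "enster"))                       # too-short stem: Fenster and Gespenster
```

The fourth call returns no hit because the sample text breaks the line between "of" and "labor", which is the reason given above for preferring short search-items over long multi-word strings.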

Search-expressions

There is often not only one but a number of formal indicators that can point to a particular concept. It would therefore be helpful to be able to search for all of these items in a single stroke.

For this purpose, ATLAS/ti allows you to define search expressions that may consist of entire collections of search-items. These search-items may in turn be reduced to their stem by using the asterisk (*) as a general placeholder. The Autocoding-window contains English-language examples of this strategy (the same sample search expressions are also found in the floating list of the search-menu). Search expressions consist of two parts: a) a name for the search swarm (to be entered in capital letters) and b) the list of the terms to look up. The separators between strings within the search expression are vertical bars ("|"; ASCII-character 124). Here is an example:

"WORK=:$division of labor|*work|Job*|occupatio*|task"
The first search-item after the name is prefixed with a dollar sign and therefore represents another search expression, which is integrated into the new search expression named WORK. The dollar sign is the indicator for that. Terms that have an asterisk at the beginning or at the end are also found as parts of longer compound words ("night-work"), in upper- or lowercase writing, as well as on their own without any addition (e.g. "work", "Work"). You can save a search expression via the "options" button (by default to the file "srchbib.skt" in the ATLAS/ti application-directory; other files may be created). Conversely, you can load a user-defined file containing search expressions before beginning the search. By default the content of the "srchbib.skt" file is loaded and can also be modified (see the respective note in the section on the relationship-editor on p. 26 as well as the question of transferring modified search expressions to other HUs).
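
To make the structure of such a search expression more concrete, here is a minimal Python sketch that splits the example into its name and its terms and turns it into one regular expression. It rests on several assumptions that the text does not confirm: the asterisk is taken to mean "any run of word characters", matching is case-insensitive throughout, and a "$" term is looked up by name in a hypothetical LIBRARY dictionary whose single entry is made up. The function names are likewise invented for illustration.

```python
import re

# Hypothetical store of previously defined expressions; name and content are made up.
LIBRARY = {"division of labor": ["division of lab*r"]}

EXPRESSION = "WORK=:$division of labor|*work|Job*|occupatio*|task"


def parse_expression(expr):
    """Split 'NAME=:term|term|...' into (name, [terms])."""
    name, sep, body = expr.partition("=:")
    terms = [t for t in body.split("|") if t]
    if not sep or not terms:
        raise ValueError("expected NAME=:term|term|...")
    return name.strip(), terms


def to_regex(expr, library):
    """Build one case-insensitive regex from a search expression (assumed semantics)."""
    _, terms = parse_expression(expr)
    resolved = []
    for term in terms:
        if term.startswith("$"):
            # Reference to another expression; unknown names are searched literally.
            resolved.extend(library.get(term[1:], [term[1:]]))
        else:
            resolved.append(term)
    # Assumption: '*' stands for any run of word characters.
    patterns = [r"\w*".join(re.escape(piece) for piece in t.split("*")) for t in resolved]
    return re.compile("|".join(patterns), re.IGNORECASE)


work = to_regex(EXPRESSION, LIBRARY)
sample = "Night-work and teamwork are part of her occupation; the job market is tight."
hits = [m.group(0) for m in work.finditer(sample)]
print(parse_expression(EXPRESSION)[0], hits)
```

Combining all terms into a single alternation means the text is scanned once for the whole collection, which mirrors the idea of searching for several indicators of one concept at the same time.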

Searching with GREP

A more elaborate version of the text-search uses so-called GREP-expressions. This type of search-function offers a subset of the "Regular Expressions" language for searching within a text. The basic principle is to include control-characters that express formal properties of the searched text-passages which go beyond a simple string. For example, you can use GREP to look up text-passages enclosed in brackets, or to specify that the searched string is retrieved only if it is located at the beginning of a line. The GREP-expressions offered by ATLAS/ti include the following characters:
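
The two examples just mentioned, passages in brackets and a string at the beginning of a line, can be illustrated with Python's re module. This is only a sketch in Python's regular-expression syntax, which may differ from the GREP subset that ATLAS/ti supports; the sample text is made up.

```python
import re

# Made-up sample text with two lines.
text = "They work hard (see appendix).\nWork starts this line [note 3]."

# Passages enclosed in round or square brackets.
bracketed = re.findall(r"\([^)]*\)|\[[^\]]*\]", text)

# "work" only where it stands at the beginning of a line;
# re.MULTILINE makes "^" match after every line-break, not just at the text start.
line_initial = [m.start() for m in re.finditer(r"^work", text, re.MULTILINE | re.IGNORECASE)]

print(bracketed)
print([text[i:i + 4] for i in line_initial])
```

The first line of the sample also contains "work", but not at the beginning of the line, so only the occurrence that opens the second line is reported.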

Autocoding

By far the most complex version of the search-function is the Autocoding-function, which is accessed through the context-menu of the code-list. It searches for strings and codes either the strings themselves or a defined environment of these strings with a previously defined code. The most recently active code is preselected as the code. However, the selection may be changed at any time through the code-list window integrated into the autocoding function.

You can optionally specify a) whether to code the surrounding sentence or paragraph rather than only the string, and b) whether to apply the search to the active primary-text, to the PD-family or to all PDs.

ATLAS/ti treats a sentence as running from period to period. A period used within a sentence (as with abbreviations) is therefore interpreted as the end of one sentence and the beginning of the next. Paragraphs are treated as separate units when they are divided by two paragraph-marks (visually, a blank line). If only one paragraph-mark is present, the subsequent or previous paragraph up to the next blank line is marked and coded as well.
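
The two rules can be sketched in a few lines of Python. This is an illustration of the rules as stated here, not ATLAS/ti's own code; the function names sentence_span and paragraph_span and the sample text are assumptions.

```python
def sentence_span(text, pos):
    """Return (start, end) of the period-to-period unit that contains `pos`."""
    start = text.rfind(".", 0, pos) + 1          # just after the previous period, or 0
    end = text.find(".", pos)
    end = len(text) if end == -1 else end + 1    # include the closing period
    return start, end


def paragraph_span(text, pos):
    """Return (start, end) of the paragraph around `pos`; only a blank line separates."""
    start = text.rfind("\n\n", 0, pos)
    start = 0 if start == -1 else start + 2
    end = text.find("\n\n", pos)
    end = len(text) if end == -1 else end
    return start, end


# Made-up sample: an abbreviation, a single line-break, then a blank line.
text = "She met Dr. Miller at work.\nHe left early.\n\nNext paragraph."
hit = text.index("work")

s, e = sentence_span(text, hit)
print(repr(text[s:e]))   # the period in "Dr." cuts the sentence short

s, e = paragraph_span(text, hit)
print(repr(text[s:e]))   # the single line-break does not end the paragraph
```

In the sample, the coded "sentence" starts only after "Dr.", and the coded "paragraph" runs across the single line-break up to the blank line, which are the two effects described above.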

You may also choose to confirm or skip each found position one by one rather than having all found positions coded automatically.
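
The difference between fully automatic coding and confirming each found position can be sketched as follows. This is a simplified Python illustration in which only the string itself is coded; the function autocode, its confirm callback and the sample text are hypothetical and do not reflect ATLAS/ti's internals.

```python
def autocode(text, needle, code, confirm=None):
    """Return (start, end, code) for every hit; `confirm` may reject single hits."""
    codings = []
    lowered, target = text.lower(), needle.lower()
    pos = lowered.find(target)
    while pos != -1:
        start, end = pos, pos + len(needle)
        # Without a callback every hit is coded; with one, each hit is asked about.
        if confirm is None or confirm(text[start:end], (start, end)):
            codings.append((start, end, code))
        pos = lowered.find(target, pos + 1)
    return codings


text = "Work at night. Teamwork helps. No work today."


def whole_word_only(fragment, span):
    """Stand-in for a user decision: skip hits that sit inside a longer word."""
    start = span[0]
    return start == 0 or not text[start - 1].isalpha()


automatic = autocode(text, "work", "WORK")
confirmed = autocode(text, "work", "WORK", confirm=whole_word_only)
print(len(automatic), len(confirmed))
```

In interactive use the callback would be replaced by the user's decision for each found position; the fixed rule here only stands in for that decision so that the sketch runs unattended.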