English Language and Linguistics

 Digital English:
3 perspectives from Data to Research and Teaching

1. The term digital English is ambiguous: it can refer, first of all, to the language as such, which is normally available in digital form today anyway.

1.1. The spoken versions of English are usually transmitted in digital form nowadays, irrespective of whether they are transmitted via the internet or the telephone. In face-to-face communication, the spoken language can be digitized with modern recording devices; the recording quality needed depends on the requirements of further processing and analysis. The spoken language compiled in this way should also be made available in its raw form to future researchers, for further analysis or for verifying the research reported. This can be done in modern government archives (as in the US Proceedings of the National Academy of Sciences) or in new data initiatives like CLARIN.

1.2. Written language data today are also usually available in electronic form, either because they are transmitted in electronic versions or because they are produced in electronic form in the first place, which is clearly the case for the majority of texts today.

A lot of current-day material is available on the internet, where selection, however, is often a problem, so that the compilation of research-specific text databases is not always easy.

2. English digital research today is digital in all stages of research, from data collection through data management, processing and analysis to the final dissemination.

2.1. Tools available for downloading data from the web range from HTTrack, for the collection of electronic newspapers, to ESprit, for the collection of social media texts, e.g. from Twitter.

2.2. For data management today, a good stratification plan depends on the variables that will be used in processing and analysing the data later. The stratified compilation of academic writing, for instance, includes the variables academic level (experience and age), university or national background, and gender. Today this can be done in an Excel sheet or an Access database, so that each addition to the system makes the later processing transparent.
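The stratification idea can be sketched in a few lines of Python instead of a spreadsheet. This is only a minimal illustration: the records, the variable names (level, university, gender) and the helper stratum_counts are hypothetical and not taken from an actual project.

```python
from collections import Counter

# Hypothetical metadata records: one entry per text in the corpus.
# All identifiers and values are invented for illustration.
texts = [
    {"id": "t001", "level": "BA", "university": "A", "gender": "f"},
    {"id": "t002", "level": "BA", "university": "A", "gender": "m"},
    {"id": "t003", "level": "MA", "university": "B", "gender": "f"},
    {"id": "t004", "level": "PhD", "university": "B", "gender": "f"},
    {"id": "t005", "level": "MA", "university": "B", "gender": "f"},
]


def stratum_counts(records, variables):
    """Count how many texts fall into each combination of the given variables."""
    return Counter(tuple(r[v] for v in variables) for r in records)


# Show which strata are well filled and which are still thin.
counts = stratum_counts(texts, ["level", "gender"])
for stratum, n in sorted(counts.items()):
    print(stratum, n)
```

A table of this kind makes gaps in the compilation visible early, e.g. a stratum that contains only a single text.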

After the first compilation stage, a brief survey of the data shows where data cleaning is necessary, either because the spoken data were corrupted by noise on certain frequencies or because the written data were corrupted by technical transmission or by unusual writing circumstances. In any case, data cleaning is a necessity before we proceed to the next step in data management, that is, the annotation and tagging according to specific project needs. This may lead directly to the next step, data processing, which can be done electronically offline or online, since tools can either be downloaded or are available on the World Wide Web. Such offline tools include AntConc and WordSmith, whereas online tools range from simple internet dictionaries to megacorpora like LOB.
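For written data, a cleaning step of this kind might look as follows. This is a cautious sketch using only the Python standard library; the function name clean_text and the particular choice of steps are assumptions, and a real project would adapt them to its own sources of corruption.

```python
import html
import re
import unicodedata


def clean_text(raw):
    """Apply a few conservative cleaning steps to one written text."""
    text = html.unescape(raw)                  # resolve entities such as "&amp;"
    text = re.sub(r"<[^>]+>", " ", text)       # replace leftover HTML tags with a space
    text = text.replace("\u00a0", " ")         # turn non-breaking spaces into plain spaces
    text = unicodedata.normalize("NFC", text)  # unify composed and decomposed characters
    # Drop control characters, but keep line breaks and tabs.
    text = "".join(ch for ch in text if ch.isprintable() or ch in "\n\t")
    text = re.sub(r"[ \t]+", " ", text)        # collapse runs of spaces and tabs
    return text.strip()


# An invented example of a text damaged in transmission.
sample = "<p>Data &amp; methods\u00a0 are   discussed\x00 here.</p>"
cleaned = clean_text(sample)
print(cleaned)
```

Keeping the raw text alongside the cleaned version allows later researchers to check what the cleaning changed.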

2.3. The dissemination of research has changed dramatically: even traditional publishers have developed their e-journals, and new forms of presentation have become available because research articles are directly linked either to previous publications (references) or to the data archives, so that the research process can be followed more immediately than previously. The availability of books in electronic format has also made it easier to publish research results from PhD theses, and even MA and BA theses, through university libraries as part of university computer centres today.

New, more researcher-dominated dissemination options have recently been made available through social dissemination platforms. As most of these are commercial enterprises that were founded over the last ten years, the “market” is not very standardized yet, and researchers have to decide for themselves whether they find that ResearchGate uses too aggressive a marketing strategy, or that it is only used as a cheap alternative by young researchers who do not have access to the “real” research in the current top e-journals. In any case, the type of dissemination is an interesting variable in deciding and discussing the value of research for future projects.

3. The digital research process today is greatly facilitated by tools that have been developed for the researcher as a simple user and as a developer.

3.1. Tools for text processing can either be downloaded for free from the internet (like AntConc) or acquired through traditional publishers (like WordSmith).
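A core function of such tools is the keyword-in-context (KWIC) concordance. The following Python sketch illustrates the principle only; it is not how AntConc or WordSmith are implemented, and the function name kwic, the simple tokenization and the sample text are assumptions made for this example.

```python
import re


def kwic(text, keyword, width=3):
    """Return (left, keyword, right) tuples with `width` tokens of context per hit."""
    tokens = re.findall(r"\w+", text.lower())  # crude tokenization, lower-cased
    target = keyword.lower()
    lines = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Invented sample text for illustration.
sample = "The data were collected online. Later the data were cleaned and tagged."
hits = kwic(sample, "data")
for left, kw, right in hits:
    print(f"{left:>25} [{kw}] {right}")
```

Aligning the keyword in one column, as the print statement does, is what makes recurrent patterns to the left and right easy to see.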

3.2. Digital data collection today is not restricted to recording spoken and written data in texts; data can also be produced digitally, e.g. by using eye-tracking technology that records the gaze and the timing and movements of reading on the screen. This allows us to draw conclusions on the processing of texts or words, which can in turn show us the difficulty of phrases for specific readerships in academic writing, or the indigenization of New English phrases in sociolinguistic or global English research.
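How such gaze records could be turned into a measure of processing difficulty can be sketched as follows. The log format (participant, word, fixation duration in milliseconds), the values and the function name are hypothetical; real eye-tracking software exports richer data in its own formats.

```python
from collections import defaultdict

# Hypothetical fixation log: (participant, word fixated, duration in ms).
fixations = [
    ("p1", "indigenization", 420),
    ("p1", "phrase", 180),
    ("p1", "indigenization", 310),  # a second fixation on the same word
    ("p2", "indigenization", 510),
    ("p2", "phrase", 200),
]


def mean_total_reading_time(log):
    """Sum fixation durations per participant and word, then average over participants."""
    totals = defaultdict(int)
    for participant, word, ms in log:
        totals[(participant, word)] += ms
    per_word = defaultdict(list)
    for (_participant, word), ms in totals.items():
        per_word[word].append(ms)
    return {word: sum(values) / len(values) for word, values in per_word.items()}


result = mean_total_reading_time(fixations)
for word, ms in sorted(result.items()):
    print(word, ms)
```

On this invented log, the longer average reading time for one word than for the other would be the kind of evidence from which conclusions about processing difficulty are drawn.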

4. The teaching component of digital English can be conceptualized in a similar way to the research component, because it starts with knowledge collection, continues with knowledge management and processing, and finishes with dissemination. For instance, students can use the World Wide Web to read the explanations in Wikipedia or to watch MOOCs, then contextualize and apply them to their university or national situations, and finally disseminate their results in the same way, by producing Wikipedia entries or MOOCs that, ideally, show the discourse that has led to the advancement of science. Digital English teaching can take place in the traditional instructional setting from teacher to student, but it can also include a more modern component of student learning.

The digital English research process is summarized in the following image: