
6.4. User interface

The program is fully contained in the three modules described above: they provide the methods and functions needed to run the error-checking process on a Docx file, and they can be called from any external script or from the command line. However, because the program offers many different checks and some users may want to run only a few of them, a graphical user interface was added to improve usability.

Python is not well suited to developing native applications, such as Microsoft Windows executables (.exe), and native applications are also tied to specific operating systems. For these reasons, the graphical user interface was implemented as a web interface backed by a micro server written in Python.

Compared to native applications, web interfaces are considerably simpler to develop and often give better results for a fraction of the effort, all while being operating-system agnostic (that is, the software can run on any operating system).

Python has a variety of popular and well-supported libraries for developing web servers, including Django, one of the most popular web frameworks, but most of them are better suited to larger projects centered on the web-server aspect. For this reason, the library selected for this work was Flask, a Python micro framework designed for building small web servers quickly and reliably, which suits small projects like this one.

The user interface module for this project consists of a single Flask server that serves a few HTML files to any standard browser and uses the same mechanisms as ordinary web pages (GET and POST requests, forms) to exchange information between the user and the server.

Although a Flask application can be deployed on a production server that maintains indefinite uptime, configuring one would require more in-depth knowledge (for example, about security certificates) from the program user than was deemed acceptable. The intended use, as reflected in the instructions section below, therefore relies on the development server provided by Flask.

This mode is not safe for operation over the internet and should only be used locally. The program can still be used on a computer that is connected to the internet; it just cannot be set up as a remotely accessed web page without configuring a WSGI server, which falls outside the design criteria of the program and will not be discussed in this work.
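As an illustration of this arrangement, a minimal sketch of a single-file Flask server is given below. It is not the program's actual code: the route, the form field name `docx_file`, the port and the inline HTML are assumptions made for the example (the real interface serves separate HTML files).

```python
from flask import Flask, request

app = Flask(__name__)

# Hypothetical inline page; the actual program serves separate HTML files.
UPLOAD_FORM = """
<form method="post" enctype="multipart/form-data">
  <input type="file" name="docx_file" accept=".docx">
  <input type="submit" value="Submit">
</form>
"""


@app.route("/", methods=["GET", "POST"])
def select_file():
    """Show the upload form on GET and receive the chosen file on POST."""
    if request.method == "POST":
        uploaded = request.files.get("docx_file")
        if uploaded is None or uploaded.filename == "":
            return "No file selected", 400
        return "File received"
    return UPLOAD_FORM


if __name__ == "__main__":
    # Flask's development server, bound to the loopback interface so that
    # it is reachable only from the local machine.
    app.run(host="127.0.0.1", port=5000)
```

Running the script and opening http://127.0.0.1:5000/ in a browser would display the form; binding to 127.0.0.1 reflects the local-only use described above.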

This module is the only part of the program that contains non-Python code, as the HTML files displayed to the user require some JavaScript to work correctly. Nonetheless, since the rest of the program, including the entire core, is written in Python, the author considers the program compliant with the design criterion of developing an error-checking tool in Python.

Finally, Figure 10 shows the first page of the web interface, which the user sees after launching the development server according to the program instructions. The interface is shown as seen in Google Chrome, though it should look very similar, if not identical, in any other modern browser that supports JavaScript.

The user must select a Docx file to be examined. The only limitation is that the file must be smaller than 100 MB, which keeps the processing time reasonable. Once the file is selected, the interface changes as shown in Figure 11, and the user only needs to click the Submit button to proceed.
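The file restriction described above could be expressed as a small validation helper. The following is only a sketch: the function name, the extension check and the interpretation of 1 MB as 1024 × 1024 bytes are assumptions, not details taken from the program.

```python
from pathlib import Path

# Assumption: 1 MB is taken as 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 100 * 1024 * 1024


def is_acceptable_upload(filename: str, size_bytes: int) -> bool:
    """Accept only non-empty .docx files strictly smaller than the limit."""
    has_docx_extension = Path(filename).suffix.lower() == ".docx"
    return has_docx_extension and 0 < size_bytes < MAX_UPLOAD_BYTES


print(is_acceptable_upload("report.docx", 5_000_000))  # True
print(is_acceptable_upload("report.pdf", 5_000_000))   # False
```

In a Flask application, a comparable cap on the request size can also be set through the `MAX_CONTENT_LENGTH` configuration value.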

Figure 10. File selection page.

After clicking the Submit button, the user is presented with the check selection page, shown in Figure 12. Any combination of checks is valid, though some options may increase the processing time.

Figure 11. File selection page with file selected.

Figure 12. Check selection page.

After the user selects the options and clicks the Start button, the analysis starts and the browser tab displays a loading indicator, whose appearance depends on the specific browser.

When the program finishes the analysis, it writes the error report and automatically forwards the user to the final page of the interface, as shown in Figure 13.

If the user clicks the Download Report button, the error report is downloaded to the user's default download folder, which can be set in the browser configuration menu. If the Reset Program button is clicked, the user is taken back to the starting page and the data stored on the local server is deleted.

Figure 13. Final page: Download and Reset.

7. Analysis of an Error Report

In this section, an example of an Error Report produced by passing a TFG report through the program (with all possible checks) will be analyzed in detail to explain how to read and interpret the program results.

The TFG report used for this check was provided by my TFG supervisor for testing purposes, and it is used here as an archetypal example. Other TFG reports may produce results different from the ones shown in this section, especially if they use different citation schemes.

The results of the checks appear in a predetermined order that is always the same. A check appears in the report only if it was selected and it found at least one error; if either condition does not hold for a particular check, that check is simply omitted and the remaining ones keep the same relative order.
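The ordering rule can be sketched in a few lines. The check identifiers below are hypothetical names chosen for the example; only their sequence follows the order in which this section presents the report.

```python
from typing import Dict, Iterable, List

# Hypothetical check identifiers, listed in the fixed report order.
REPORT_ORDER = [
    "missing_sections",
    "citations",
    "figure_captions",
    "table_captions",
    "spelling_and_grammar",
]


def report_sections(selected: Iterable[str], error_counts: Dict[str, int]) -> List[str]:
    """Return the checks to print: selected, with errors, in fixed order."""
    selected_set = set(selected)
    return [
        name
        for name in REPORT_ORDER
        if name in selected_set and error_counts.get(name, 0) > 0
    ]


sections = report_sections(
    ["spelling_and_grammar", "citations", "missing_sections"],
    {"citations": 3, "spelling_and_grammar": 12, "missing_sections": 0},
)
print(sections)  # ['citations', 'spelling_and_grammar']
```

Iterating over the fixed list, rather than over the user's selection, is what keeps the order stable regardless of how the checks were chosen.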

The first page of the error report consists of an error summary, as shown in Figure 14, that explains how the error report is structured, how many errors each check has found and some of the caveats that may limit the usefulness of the provided data.

Figure 14. Error report summary.

If any of the error checks encounters an execution error and the check cannot proceed, the program will log the error, proceed with the other selected checks and notify the user with a warning on this same first page of the error report, as can be seen in Figure 15.
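One common way to obtain this behavior is to wrap each check in its own `try`/`except` block. The sketch below assumes that each check is a callable returning a list of error messages; the function and logger names are illustrative and not taken from the program.

```python
import logging
from typing import Callable, Dict, List, Tuple

logger = logging.getLogger("error_checker")  # hypothetical logger name


def run_checks(
    checks: Dict[str, Callable[[], List[str]]]
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Run every check; collect results and the names of checks that crashed."""
    results: Dict[str, List[str]] = {}
    failed: List[str] = []
    for name, check in checks.items():
        try:
            results[name] = check()
        except Exception:
            # Log the traceback and continue with the remaining checks.
            logger.exception("Check %r could not be completed", name)
            failed.append(name)
    return results, failed
```

The list of failed checks returned here is the kind of information that could feed the warning shown on the summary page.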

After the summary page, the first errors shown are the missing section errors, as can be seen in Figure 16.

In this particular case, the glossary and budget sections could not be found. The error message already explains that these could be false alarms, that is, sections that exist but that the program failed to detect; it could also be that this particular TFG report does not need these sections.

The next section contains the citation errors, which are of two kinds: citations in the Bibliography or in the text appearing in the wrong order, and sources in the Bibliography that are never referenced in the text. The specific error message can be seen in Figure 17.

Figure 16. Missing sections errors.

Figure 15. Alert message for checks with execution errors.

In this TFG report, the order of the numbered citations in the Bibliography section is correct, so the citation order error section is not present in the error report. However, the program has not found references in the text for some or all of the sources in the Bibliography, so it includes the "citation references missing from text" section in the error report, as seen in Figure 17.
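The two citation checks can be illustrated with a short sketch. It assumes a numeric citation scheme with bracketed markers such as [3] or [3, 4]; the function names and the regular expression are assumptions, and the program's actual implementation may differ.

```python
import re
from typing import Iterable, List

# Assumption: citations are written as [1] or [1, 2] in the text.
BRACKETED = re.compile(r"\[(\d+(?:\s*,\s*\d+)*)\]")


def cited_numbers(text: str) -> List[int]:
    """Return every citation number found in the text, in order of appearance."""
    numbers: List[int] = []
    for group in BRACKETED.findall(text):
        numbers.extend(int(n) for n in re.findall(r"\d+", group))
    return numbers


def uncited_sources(bibliography_numbers: Iterable[int], body_text: str) -> List[int]:
    """Return the bibliography entries that are never cited in the body text."""
    cited = set(cited_numbers(body_text))
    return [n for n in bibliography_numbers if n not in cited]


def is_in_ascending_order(numbers: Iterable[int]) -> bool:
    """Check that the numbers are strictly increasing."""
    numbers = list(numbers)
    return all(a < b for a, b in zip(numbers, numbers[1:]))


body = "As shown in [1] and later in [3, 4]."
print(uncited_sources([1, 2, 3, 4, 5], body))  # [2, 5]
```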

After this, the next sections cover errors in figure and table captions, in that order. First, the error report shows how many images were found without a caption, as seen in Figure 18. Only a number is given because the internal names of the pictures in a Docx document are not human-readable, so it is very difficult to identify a specific uncaptioned picture unambiguously for the reader. The number is sufficient to give the program user an idea of the magnitude of the problem and to prompt them to revisit the document and correct these errors.

Next, the error report lists the figure captions with missing labels, that is, the captions it has found that do not include Fig. or Figure in the text box. These errors, and those that follow, take up more and more page space in the error report, so showing just one or a few errors from each section is enough to understand how the error report works.

Figure 17. Citation errors.

Figure 18. Missing figure captions errors.

An error, along with the heading for this error section and an explanation of what the program is checking, can be seen in Figure 19.

Each error in this section is identified by its subtitle – in this case, Figure caption 4, which denotes this error was found in the fourth figure with a caption in document order.
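A label check of this kind, together with the document-order numbering used in the subtitles, can be sketched as follows. The case-sensitive pattern and the function name are assumptions made for illustration.

```python
import re
from typing import List

# Assumption: a valid label is the literal text "Fig." or "Figure" (case-sensitive).
FIGURE_LABEL = re.compile(r"\b(?:Fig\.|Figure)")


def captions_missing_label(captions: List[str]) -> List[int]:
    """Return the 1-based document-order positions of captions without a label."""
    return [
        position
        for position, caption in enumerate(captions, start=1)
        if not FIGURE_LABEL.search(caption)
    ]


captions = [
    "Figure 1. Test bench",
    "Fig. 2. Wiring diagram",
    "Figure 3. Results",
    "Flow diagram of the process",
]
print(captions_missing_label(captions))  # [4]
```

With this input, the fourth caption would be the one reported, matching the "Figure caption 4" style of subtitle.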

The last subsection of the Figure caption errors section contains the missing references to figures, some examples of which can be seen in Figure 20.
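A missing-reference check can be sketched in the same spirit. The example assumes that the figure numbers have already been extracted from the captions and that the body text passed in excludes the captions themselves; both the function name and the pattern are illustrative.

```python
import re
from typing import Iterable, List

# Assumption: figures are referenced in the text as "Figure N" or "Fig. N".
FIGURE_REFERENCE = re.compile(r"\b(?:Fig\.|Figure)\s*(\d+)")


def unreferenced_figures(figure_numbers: Iterable[int], body_text: str) -> List[int]:
    """Return the figure numbers that are never mentioned in the body text."""
    referenced = {int(n) for n in FIGURE_REFERENCE.findall(body_text)}
    return [n for n in figure_numbers if n not in referenced]


body = "Figure 1 shows the setup, and Fig. 3 the results."
print(unreferenced_figures([1, 2, 3], body))  # [2]
```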

The errors for Table captions appear next in the error report, but their structure is identical to that of the figure caption error section explained above, so it is not necessary to explain it in detail.

The last big section contains the spelling, grammatical and style errors found in the sections of the document selected by the user in the user interface.

First, the errors in the main text of the document are shown, divided into subsections using the TFG report's own headings. A few of these errors can be seen in Figure 21.

Figure 19. Example of a missing figure label error.

Figure 20. Examples of some missing references to figures errors.

The rest of the error report is made up of the spelling, grammatical and style errors found inside the tables, text boxes, headers and footers, if those checks were selected. The format will be the same as with the errors in the main text, so no figures or detailed explanations are needed.

Figure 21. Examples of some spelling and grammatical errors.

8. Program instructions

In this section, instructions are provided for both installing and using the program. The installation process only needs to be run once, as long as the folders, files and Python installation do not change. The program will work on any operating system capable of running Python 3.7 and Java 8 or above, but instructions are only provided for the most common operating systems: Microsoft Windows and Linux.
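Before installing, the two version requirements can be verified with a short script. The sketch below is not part of the program: the function names are hypothetical, and the parser assumes the usual `version "..."` line printed by `java -version`.

```python
import re
import sys
from typing import Optional


def python_is_supported(version_info=sys.version_info) -> bool:
    """Check that the interpreter is Python 3.7 or newer."""
    return tuple(version_info[:2]) >= (3, 7)


def java_major_version(version_output: str) -> Optional[int]:
    """Extract the Java major version from the output of `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Releases up to Java 8 report themselves as 1.x (for example "1.8.0_281").
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


print(java_major_version('java version "1.8.0_281"'))  # 8
print(java_major_version('openjdk version "11.0.2"'))  # 11
```

Note that `java -version` writes to standard error, so the text to parse can be captured with, for example, `subprocess.run(["java", "-version"], capture_output=True, text=True).stderr`.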
