File Marking Method

NOTE: View this help file full screen so that the text lines up properly. This is especially important for the content below.

 

The TXL Wizard File Marking and Recipe methods views input text as a collection of data tables (rows and columns). Columns separated by spaces can be parsed into the correct columns within Excel (or CSV). The user has considerable control over where each exported table's rows and columns begin and end. Tabulated data that often seem extremely difficult or almost impossible to properly enter into Excel or databases without human errors will now be easy, fast and error free.

 

This method involves inserting special markers within the text file to control the translation process.

 

Marker

Purpose

>>n

Table Start, where n is the table number

<<n

Table End

--------

Single Column Field.  Copies data to one Excel Column.

=========

Multiple Column Field. Copies data to many Excel columns

                     Summary of Markers Table

 

Using a combination of these simple markers, many complex data parsing tasks can be achieved.

How to Do It

To copy one or more tables within a text file to Excel/CSV using the File Marking  method, first open the file into your preferred text editor. Next insert two control lines before the top row of the data. The two control lines you should insert are the Table Start marker followed by Column Parsing Template. Then after the last data row of the table, insert the Table End Marker as shown below:

>>1                                       Table Start Marker (n=1)
------------ ------------- -----------    Column Parsing Template
xxxxxxxxxxxx yyyyyyyyyyyyy zzzzzzzzzzz    Text data to import to Excel/CSV
xxxxxxxxxxxx yyyyyyyyyyyyy zzzzzzzzzzz    Text data to import to Excel
/CSV
xxxxxxxxxxxx yyyyyyyyyyyyy zzzzzzzzzzz    Text data to import to Excel
/CSV
xxxxxxxxxxxx yyyyyyyyyyyyy zzzzzzzzzzz    Text data to import to Excel
/CSV
xxxxxxxxxxxx yyyyyyyyyyyyy zzzzzzzzzzz    Text data to import to Excel
/CSV
<<1                                       Table End Marker


NOTE: Blue text in Italics immediately above is not part of the data file. It is only there for instructional purposes.

 

Short Cut Markers

In the case where only a single type of field marker is required, you can use a short cut marker instead of typing many repeated "-" (dash) or "=" (equal) characters:

 

Single Field Only -

A single "-" at the beginning of the first column means all "-" to the end of line.  Sample 1 and Sample 2 below are identical:

Sample 1                                                   Sample 2

>>1                                                        >>1
-                                                          -------------------
aaaa  bbbb  cccc  dddd                                     aaaa bbbb cccc dddd
aaaa  bbbb  cccc  dddd                                     aaaa bbbb cccc dddd 
aaaa  bbbb  cccc  dddd                                     aaaa bbbb cccc dddd
aaaa  bbbb  cccc  dddd                                     aaaa bbbb cccc dddd 
<<1                                                        <<1

 

Multi-Field Only =

A single "=" at the beginning of the first column means all "=" to the end of line. Sample 3 and Sample 4 are identical:


Sample 3                                                   Sample 4
>>1                                                        >>1
=                                                          ==============
=====

aaaa bbbb cccc dddd                                        aaaa bbbb cccc dddd
aaaa bbbb cccc dddd                                        aaaa bbbb cccc dddd
aaaa bbbb cccc dddd                                        aaaa bbbb cccc dddd
aaaa bbbb cccc dddd                                        aaaa bbbb cccc dddd

<<1                                                        <<1

 



Example: Importing a text file to Excel or CSV

The following example illustrates how and where to use the file markers.

Assume you have a text file that looks something like this:

xxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxx

xxxxxxx yyyyyyy zzzz wwwwwwwwwwww
        yyyyyyy      wwwwwwwwwwww
xxxxxxx yyyyyyy zzzz wwwwwwwwwwww

xx yyyyyy zzzzzzzzzzzz wwwwwwwwwwwwww
xx        zzzzzzzzzzzz wwwwwwwwwwwwww
xx        zzzzzzzzzzzz wwwwwwwwwwwwww

It is possible to fit the above text (represented by the x's, y's, z's, and w's data columns) correctly into 3 tables. Each table may have different number of rows and columns.

The process is very simple:

Step 1: Edit the text file (or a copy of it) by inserting Markers to define the start and end of table rows and columns as follows:

 

FILE CONTENTS

COMMENTS

>>1

--------------------------

xxxxxxxxxxxxxxxxxxxxxxxxxx

xxxxxxxxxxxxxxxxxxxxxxxxxx

xxxxxxxxxxxxxxxxxxxxxxxxxx

xxxxxxxxxxxxxxxxxxxxxxxxxx

<<1

 

>>2

------- ------- ---- ------------

xxxxxxx yyyyyyy zzzz wwwwwwwwwwww

        yyyyyyy      wwwwwwwwwwww

xxxxxxx yyyyyyy zzzz wwwwwwwwwwww

<<2

 

>>3

-- ------ ------------ --------------

xx yyyyyy zzzzzzzzzzzz wwwwwwwwwwwwww

xx        zzzzzzzzzzzz wwwwwwwwwwwwww

xx        zzzzzzzzzzzz wwwwwwwwwwwwww

<<3

Table 1 start marker

Import as one Column

First data row

.

.

Last data row

Table 1 end marker

 

Table 2 start marker

Import as 4 Columns

First data row

.

Last data row

Table 2 end marker

 

Table 3 start

Import 4 Columns

First data row

.

Last data row

Table 3 end marker


Step 2: After saving the edited text file, launch TXL Wizard, select Use a marked up text file, choose your input and output file paths, then click Convert (or F5).

If converting to CSV a csv file is created with the appropriate data.

If converting to an Excel workbook, this will invisibly launch Microsoft Excel and copy the data into the correct cells. If the TXL Wizard option Clear sheet contents before processing check box is checked then the existing data in the output Excel file will be cleared before processing (formatting will be left intact).


In many cases, you can make the TXL Wizard automatically determine the columns. For example we can alternatively mark table 2 as follows:

>>2                                       Table 2 start
=====================================     Import as multiple (4) Columns
xxxxxxx yyyyyyy zzzz     wwwwwwwwwwww
    . First data row
yyyyyyy                  wwwwwwwwwwww
    .
xxxxxxx yyyyyyy zzzz     wwwwwwwwwwww
    . Last data row
<<2                                       Table 2 end

Sometimes you may need to mix the column markers to achieve a desired effect. For example:

>>2                                    Table 2 start
==================== -------------
    Creates 3 + 1 Columns
xxxxxxx yyyyyyy zzzz ww wwwwwwwwww     . First data row
        yyyyyyy      ww wwwwwwwwww
    .
xxxxxxx yyyyyyy zzzz ww wwwwwwwwww
    . last data row
<<2                               
     Table 2 end

This will import the data into 4 columns (3 columns for XYZ data, followed by one column for the "w" data in a single column. Note that the "w" data is not split into 2 columns since it has been marked with a single column marker "-" instead of the multiple column marker "=".

If you wish to import the spaces between the z's and the w's into a separate column in Excel, you should do this instead:

>>2                                        Table 2 start
==================== --- -------------
    Creates 5 Columns
xxxxxxx yyyyyyy zzzz     ww wwwwwwwwww 
 . First data row
        yyyyyyy          ww wwwwwwwwww 
 .
xxxxxxx yyyyyyy zzzz     ww wwwwwwwwww 
 . last data row
<<2                                   
     Table 2 end

 

Sometimes the columns are not separated by any spaces; rather they are determined by exact location. For example the table below:

     >>1
     -- -- ----
     MM/DD/YYYY
     01/09/2001
     02/15/2001
     02/17/2003
     .
     .
     <<1

As marked above, the second data row will copy "01/", "09/" and "2001" into Excel cells A2, B2, C2, respectively, with the slash marks being included.  When being used for exact location, the dash markings should be interpreted as meaning the exact number of characters represented by the dashes plus one additional character at the end of each column, with the exception of the last column.  The additional character is not automatically added onto the last column.  

If you wanted "01", "09" and "2001" without the slashes, do this:

     >>1
     -  -  ----
     MM/DD/YYYY
     01/09/2001
     02/15/2001
     02/17/2003
     .
     .
     <<1

 

Here is another example to illustrate this concept.  If your file markings on your input file look like this:

>>1
-    --   ---  ---- ----
1234/5678/1234/5678/1234
1234/5678/1234/5678/1234
1234/5678/1234/5678/1234
. .
. .
<<1
 

Your output in A1:E2 will look like this: 

12    567   1234   5678/   1234
12    567   1234   5678/   1234
. .