State of knowledge

July 2023

Observation

If the ODBC replicator with the Microsoft Access text driver is used instead of the text replicator, there may be a problem with the character encoding. For example, umlauts are displayed incorrectly in the replicated dataset.

Reason

File format

The text file to be replicated is not in the ANSI or OEM formats supported by the driver by default, but in UNICODE (UTF-16) or UTF-8, for example.
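To check which format a file actually has before replicating it, the first bytes can be inspected for a byte order mark (BOM). The following is a minimal sketch; the function name `detect_bom` is hypothetical and not part of MetaDirectory. It only recognizes files that start with a BOM. A UTF-8 file saved without a BOM cannot be told apart from ANSI this way.

```python
import codecs

# Known byte order marks and the encoding they indicate.
# Note: a UTF-32 LE file also starts with the UTF-16 LE mark and would be
# reported as UTF-16 LE by this simple check.
_BOMS = [
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
]


def detect_bom(path):
    """Return the encoding indicated by the file's BOM, or None if there is none."""
    with open(path, "rb") as f:
        head = f.read(4)
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return None
```

A result of `None` means only that no BOM was found; the file may still be ANSI, OEM, or UTF-8 without a BOM.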

The MetaDirectory text replicator offers a wider format selection here:

Example screenshot: Format selection in text replicator


Example screenshot: Format selection in ODBC MS Access Text Driver

Possible solutions

Define text format

If you switch to OEM under Define text format (image above), a file named schema.ini is created in the same directory as the file to be replicated.
Information about the structure can be found at https://learn.microsoft.com/en-US/sql/odbc/microsoft/schema-ini-file-text-file-driver?view=sql-server-ver16

Example screenshot: schema.ini


Besides the name of the file to be replicated, this file contains, among other things, the character set under CharacterSet.
In addition to ANSI, OEM and UNICODE, the value 65001 for UTF-8 can also be entered here.

schema.ini

[orginal.csv]
ColNameHeader=True
Format=Delimited(|)
MaxScanRows=25
CharacterSet=65001
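If the schema.ini is to be created by script rather than through the driver dialog, it can be written next to the data file. The following is a minimal sketch that reproduces the entries shown above; the function name `write_schema_ini` and its parameters are hypothetical. It overwrites an existing schema.ini, so sections for other files in the same directory would be lost.

```python
from pathlib import Path


def write_schema_ini(data_file, delimiter="|", character_set="65001"):
    """Write a schema.ini for data_file into the same directory and return its path.

    Assumption: the directory holds no other schema.ini sections worth keeping,
    because the file is overwritten.
    """
    data_file = Path(data_file)
    lines = [
        f"[{data_file.name}]",                # section name is the file name only
        "ColNameHeader=True",                 # first row contains column names
        f"Format=Delimited({delimiter})",     # custom delimiter
        "MaxScanRows=25",
        f"CharacterSet={character_set}",      # 65001 = UTF-8 code page
    ]
    schema_path = data_file.parent / "schema.ini"
    # newline="\r\n" makes Python write Windows line endings.
    with open(schema_path, "w", encoding="ascii", newline="\r\n") as f:
        f.write("\n".join(lines) + "\n")
    return schema_path
```

For example, `write_schema_ini(r"C:\import\orginal.csv")` would create `C:\import\schema.ini` with the five lines listed above.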

Use a different driver

Alternatively, you can use another driver: Microsoft.Jet.OLEDB.4.0
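For orientation, a connection string for this provider typically names the folder (not the file) as the data source and passes the text options as extended properties. The sketch below only builds such a string; the function name `jet_text_connection_string` is hypothetical, and whether the `CharacterSet` property is honored in a given environment is an assumption that should be verified. Note that Microsoft.Jet.OLEDB.4.0 is available for 32-bit processes only.

```python
def jet_text_connection_string(folder, has_header=True, character_set="65001"):
    """Build a Jet OLEDB connection string for delimited text files in folder.

    Assumption: the individual file is selected later as the table name,
    and column details are still taken from schema.ini if present.
    """
    hdr = "Yes" if has_header else "No"
    extended = f"text;HDR={hdr};FMT=Delimited;CharacterSet={character_set}"
    return (
        "Provider=Microsoft.Jet.OLEDB.4.0;"
        f"Data Source={folder};"
        f'Extended Properties="{extended}"'
    )
```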