[Setup] SFTP Guidelines

SFTP import

Our platform offers an SFTP file import feature for structured data. You can use this import feature for one-shot imports but also for recurring automated file ingestion.
You just have to prepare your files according to the following guidelines and drop them into a specific SFTP folder.

File Guidelines

When preparing your files, please follow these recommendations:
  1. We support the following file formats:
      • CSV
      • JSON
  1. UTF-8 or UTF-8 BOM encoding must be used.
  1. Your file name must follow our naming pattern and include the entity (scope), the source, and the date; a subsection is optional.
  1. Each file should preferably contain data from a single business entity or "scope".

File formats

Comma-separated values (CSV) are the most commonly used format for importing files.
CSV files are text files organized in columns and rows, like tables: columns are separated by a delimiter (semicolon ; by default, but any other delimiter is accepted), and each row is one line of the file.
We also support JSON file imports through SFTP.
💡
Any other file format will be considered non-standard and must be discussed in pre-sales.

UTF-8 encoding

Your file must be saved with UTF-8 encoding (with or without BOM).
Thanks to this encoding, all characters of different languages can coexist in the same file, such as French and Chinese.
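An encoding check before upload can catch files saved in another charset. The following is a minimal sketch in Python and is not part of the platform; the function name is_valid_utf8 is hypothetical. It relies on Python's built-in utf-8-sig codec, which accepts UTF-8 with or without a BOM.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode as UTF-8, with or without a BOM."""
    try:
        # "utf-8-sig" strips a leading BOM when present and otherwise
        # behaves like plain "utf-8".
        data.decode("utf-8-sig")
        return True
    except UnicodeDecodeError:
        return False
```

A file exported as Latin-1, for instance, typically fails this check as soon as it contains an accented character.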

Business entities (scopes)

Whenever possible, each file should contain data for a single business entity.
Please find below a short description of available scopes:
  • Contacts (contacts) - individuals, customers, or contacts in your database;
  • Consents (consents) - consents from individuals to receive communications, usually attached to an email address, phone number, or device ID;
  • Stores (stores) - stores and e-commerce sites;
  • Products (products) - products in your catalog;
  • Tickets (orders) - completed transactions or abandoned carts created by customers;
  • Ticket lines (ordersitems) - all items belonging to an order or abandoned cart (e.g. individual items on a ticket);
  • Events (events) - events attached to an identified contact such as marketing interactions (clicks, opens) or web browsing logs.
💡
Cases of multi-scoped files (e.g. consents in contact files, or ordersitems nested in orders) should be discussed with your Project Manager so that we can adapt our data processing to your constraints.

File naming rules

Our platform relies on a specific naming convention to select files for import. You must follow these rules to ensure that the files are processed correctly:
  • All files must be named according to the following scheme: <entity>_<source>_<subsection>_<date>.csv
  • Elements surrounded by <> must be replaced with a real value. All of them are required except <subsection>.
    • <entity> is one of the business entities described above;
    • <source> is the name of the system where the data comes from (e.g. prestashop, pos, etc.);
    • <subsection> is an optional parameter used to provide additional information. For example, a single source can have two contact files: contacts_prestashop_customers and contacts_prestashop_subscribers.
    • <date> is the date of the day when the file was created. The date must follow the format YYYYMMDD, such as 20220315.
For example, if you have a contacts file from Salesforce without optional subsection, it should be named contacts_salesforce_20220315.csv.
💡
The file name can optionally be extended with additional information at the beginning (a prefix) and directly before the .csv extension (a suffix).
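To check file names before uploading, the naming rules can be sketched as a regular expression. This is an illustrative Python sketch, not the platform's actual matching logic. It assumes single-underscore separators as in the example names above, assumes that source and subsection contain only letters and digits, and ignores the optional prefix and suffix. The names ENTITIES, FILE_NAME_RE, and parse_file_name are hypothetical.

```python
import re
from datetime import datetime

# Entity codes taken from the list of scopes in this guide.
ENTITIES = ("contacts", "consents", "stores", "products",
            "orders", "ordersitems", "events")

# Assumption: source and subsection are alphanumeric, without underscores.
FILE_NAME_RE = re.compile(
    r"^(?P<entity>" + "|".join(ENTITIES) + r")"
    r"_(?P<source>[A-Za-z0-9]+)"
    r"(?:_(?P<subsection>[A-Za-z0-9]+))?"
    r"_(?P<date>\d{8})\.csv$"
)

def parse_file_name(name):
    """Return the parts of a file name, or None if it breaks the rules."""
    match = FILE_NAME_RE.match(name)
    if match is None:
        return None
    parts = match.groupdict()
    try:
        # Rejects impossible dates such as 20221345.
        parts["date"] = datetime.strptime(parts["date"], "%Y%m%d").date()
    except ValueError:
        return None
    return parts
```

With this sketch, contacts_salesforce_20220315.csv parses with no subsection, while contacts_prestashop_customers_20220315.csv yields the subsection customers.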

File contents

Required columns

  1. Primary key of the source
Each file must contain:
  • either a single column that uniquely identifies each record,
  • or a set of columns that, combined, form a unique identifier for each record.
Records with a null value in the key won't be imported.
  1. Date of last update
Each file must contain a column indicating the last modification date of the records. This information is essential for the proper functioning of the CDP (incremental change detection).
  • If you have such a column but some records have a null value in it, we can fill it with the maximum value found in your batch of records.
  • In the absence of such a column, the date in the file name (YYYYMMDD format) will be used.
These two fallbacks often come with drawbacks, especially when comparing the "freshness" of data from several sources for a single record.
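The rules for required columns can be illustrated with a short Python sketch. It only mirrors the behaviour described above and is not the platform's implementation. The column names id and updated_at, the function name apply_fallbacks, and the timestamp format passed as file_date are assumptions made for the example.

```python
def apply_fallbacks(records, file_date, key="id", updated="updated_at"):
    """Drop records without a key and fill missing last-update dates.

    records   -- list of dicts, one per record (hypothetical layout)
    file_date -- date taken from the file name, as a timestamp string
    """
    # Records with a null key are not imported.
    kept = [r for r in records if r.get(key) not in (None, "")]
    # Timestamps like "2022-01-31 19:01:00" sort correctly as strings.
    known = [r[updated] for r in kept if r.get(updated)]
    # Use the batch maximum when available, else the file-name date.
    fallback = max(known) if known else file_date
    return [{**r, updated: r.get(updated) or fallback} for r in kept]
```

Because every record with a missing date gets the same fallback value, such records can look fresher or staler than they really are when compared with another source.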

CSV Header Line

The first line of the files should always be the header. We use this header to assess the number of columns in the file and to map its content during import.
Preferably, values in the header should be in lowercase, without spaces (use _ to replace them) or special characters, and enclosed in double quotes ".
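As an illustration of these header recommendations, here is a minimal Python sketch that normalizes column names and builds a header line. The function names normalize_header and header_line are hypothetical, and the semicolon separator is the default mentioned earlier in this guide.

```python
import re

def normalize_header(name: str) -> str:
    """Lowercase a column name, replace spaces with _, drop special characters."""
    cleaned = name.strip().lower()
    cleaned = re.sub(r"\s+", "_", cleaned)
    # Anything other than a-z, 0-9 and _ is removed (accented letters included).
    return re.sub(r"[^a-z0-9_]", "", cleaned)

def header_line(names, sep=";"):
    """Build the header line with each name enclosed in double quotes."""
    return sep.join('"' + normalize_header(n) + '"' for n in names)
```

For example, the column names Id and Last Update would produce the header "id";"last_update".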

Lines in a CSV file

All lines in the file must contain the same number of columns as the header.
Each line of the file must contain only one record.
All text values must be enclosed by double quotes ".
⚠️
Avoid multi-line data as much as possible: it can alter the line-by-line structure of the file and prevent it from being imported properly.
If your file contains values that span several lines (due to line breaks), these values must be enclosed in double quotes ".
If a value is null in your source system, write nothing between the two column separators, regardless of the data type: the separators end up adjacent (e.g. ;;).
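A quick way to verify the column-count rule before uploading is to parse the file with a CSV reader and compare each record with the header. The sketch below uses Python's standard csv module; the function name find_bad_records is hypothetical, and the check is only an approximation of what the import performs.

```python
import csv
import io

def find_bad_records(text, sep=";"):
    """Return (record_number, column_count) for records that do not
    have the same number of columns as the header (record 1)."""
    # newline="" lets the csv module handle line breaks inside quoted values.
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=sep)
    header = next(reader, None)
    if header is None:
        return []  # empty file: nothing to check
    expected = len(header)
    return [
        (number, len(row))
        for number, row in enumerate(reader, start=2)
        if len(row) != expected
    ]
```

Records are numbered rather than physical lines, so a properly quoted multi-line value counts as a single record.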

Accepted value types

The platform accepts the following data types: a plain text string, JSON, number, date, boolean, or list of values.
Of course, it is possible to pass any type of value as text and then cast/manipulate it within the platform, but we strongly recommend that the data be passed in the correct format to facilitate business processing.

Example

Mike Cole received a file of contacts to import. The file is small, so he decides to check it. He opens it in his text editor:
"id";"json";"text";"date";"bool";"number";"array"
"id_1";"{""name"":""Jean""}";"Hello";"2022-01-01 00:01:00";1;1.56;value1,value2
"id_2";"{""name"":""Michel""}";"Hi";"2022-01-31 19:01:00";;-5.8;value1
"id_3";"{""name"":""Marie""}";"Hola";"2022-01-31 07:01:00";0;99.156;
Note: the columns Id and last modification date are mandatory, see above.
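To see how such a file maps to typed values, the sample can be read with Python's standard csv module and cast column by column. This is a sketch for checking a file locally, not a description of how the platform casts data. The function name cast_row is hypothetical, and the choices of 1/0 for booleans and a comma-separated list for arrays are inferred from the sample itself.

```python
import csv
import io
import json
from datetime import datetime

SAMPLE = '''"id";"json";"text";"date";"bool";"number";"array"
"id_1";"{""name"":""Jean""}";"Hello";"2022-01-01 00:01:00";1;1.56;value1,value2
"id_2";"{""name"":""Michel""}";"Hi";"2022-01-31 19:01:00";;-5.8;value1
"id_3";"{""name"":""Marie""}";"Hola";"2022-01-31 07:01:00";0;99.156;
'''

def cast_row(row):
    """Cast one record; an empty field is treated as null (None)."""
    return {
        "id": row["id"],
        "json": json.loads(row["json"]) if row["json"] else None,
        "text": row["text"] or None,
        "date": (datetime.strptime(row["date"], "%Y-%m-%d %H:%M:%S")
                 if row["date"] else None),
        "bool": None if row["bool"] == "" else row["bool"] == "1",
        "number": float(row["number"]) if row["number"] else None,
        "array": row["array"].split(",") if row["array"] else None,
    }

reader = csv.DictReader(io.StringIO(SAMPLE, newline=""), delimiter=";")
rows = [cast_row(r) for r in reader]
```

In this sketch, the empty bool of id_2 and the empty array of id_3 both come out as None, matching the rule that a null is written as nothing between two separators.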
⚠️
Since all text values must be enclosed in double quotes ", any double quote " inside a text string must also be properly escaped.
This escaping is done by doubling the double quote, as in the example below:
  • Raw data: simple "test
  • Correctly escaped and enclosed data: "simple ""test"
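When generating files programmatically, the quoting, escaping, and null rules can be combined in one small formatter. The following Python sketch is one possible way to do it; the function names format_value and format_row are hypothetical, and writing booleans as 1 and 0 is an assumption based on the sample file.

```python
def format_value(value):
    """Format one value for a CSV line following the rules of this guide."""
    if value is None:
        return ""  # null: nothing between the two separators
    if isinstance(value, bool):
        return "1" if value else "0"  # assumption: booleans as 1/0
    if isinstance(value, (int, float)):
        return str(value)  # numbers are left unquoted
    # Text: double any embedded double quote, then enclose in double quotes.
    return '"' + str(value).replace('"', '""') + '"'

def format_row(values, sep=";"):
    """Join formatted values with the column separator."""
    return sep.join(format_value(v) for v in values)
```

For instance, the raw text simple "test is written as "simple ""test", and a null between two other values leaves the separators adjacent.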

SFTP repository

Your Project Manager will provide you with the credentials of your SFTP repository. If you need to upload test files manually and don't know how to set up an SFTP client yourself, ask an IT specialist at your company.
If your repository is empty after you log in, contact your Project Manager for help: your automatic import feature may still need to be activated.

Directory structure

The SFTP repository contains a specific directory structure used by the import automation.
The raw directory is where you should put all the files you want to import during the run phase.
During the setup phase, you can share your files with the project team through the samples folder for the first few exchanges, and then the init folder once the model of each file has been stabilized.
The imports and exports directories are for the exclusive use of the platform.

Import mode
