Treasure Data's primary idea portal. 

S3 connector file name pattern support

Currently, the S3 DataConnector accepts only a bucket and a prefix to select input data files.
Specifying import files with a pattern rather than a prefix, such as YYYYMMDD_filename.tsv, would give more control.

  • Ryutaro Yada
  • Jun 10 2016
  • Shipped
  • Ryutaro Yada commented
    June 10, 2016 23:33

    Hello Rob,

    I confirmed that it works with the data connector.
    Thank you for the support!

  • Ryutaro Yada commented
    June 10, 2016 23:33

    Currently we can only specify a forward-match pattern at config:in:path_prefix in load.yml. So for a single load operation we have to either separate files into different buckets or give all the files the same leading pattern.

    <current config>
    - path_prefix: bucket_name
    --> we have to create and manage many buckets
    - path_prefix: bucket_name/fileprefix
    --> imports bucket_name/fileprefix* files.

    <expected config>
    - path_prefix: bucket_name/*_postfix.tsv
    --> allow backward match
    - path_prefix: bucket_name/*_pattern.*
    --> middle pattern match

    This flexibility in file specification is strongly related to this feedback.

    Data connector operation is straightforward but requires a formal, config-driven procedure. That prevents customers from moving from td import to the data connector, because many customers currently control their workflows with batch scripts. td import takes execution parameters, so it works more easily with such batch scripts than the data connector's yml config does.
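
To make the backward and middle matching requested above concrete, here is a rough Python sketch. It is not the data connector's implementation. It only filters a hypothetical list of object keys with shell-style wildcards from the standard library's fnmatch module. The key names and the select_keys helper are assumptions made up for illustration.

```python
from fnmatch import fnmatchcase


def select_keys(keys, pattern):
    """Return the keys that match a shell-style wildcard pattern.

    Hypothetical helper: this only illustrates the matching rule,
    not how the data connector actually selects files.
    """
    return [key for key in keys if fnmatchcase(key, pattern)]


# Hypothetical object keys inside one bucket.
keys = [
    "20160609_postfix.tsv",
    "20160610_postfix.tsv",
    "20160610_pattern.csv",
    "20160610_pattern.tsv",
    "other.tsv",
]

# Backward match: any leading text, fixed ending.
suffix_matches = select_keys(keys, "*_postfix.tsv")

# Middle match: the fixed text sits between two wildcards.
middle_matches = select_keys(keys, "*_pattern.*")

print(suffix_matches)
print(middle_matches)
```

With matching like this, files for several dates could stay in one bucket and still be picked up by one load operation, which is what the expected config asks for.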