CHANGELOG.md (9 additions & 0 deletions)
@@ -2,6 +2,15 @@
2
2
3
3
All notable changes to this project will be documented in this file.
4
4
5
+
## 1.3.0 - 2024-11-18
6
+
### Added features
7
+
- Added a manual/on-demand scheduler configuration option. An SSM Parameter stores the Step Functions event structure. Note: this affects file timestamp comparison at runtime and is recommended for on-demand use cases only.
8
+
- Added the ability to configure individual schedules in `SyncSettings`. This enables different schedules for multiple folders within a single SFTP connection. Individual schedules take precedence over the general `Schedule` setting. This reduces resource consumption by eliminating the need for multiple configuration files and avoids creating a dedicated Transfer Family Connector (including public IPs) and Secrets per configuration file.
9
+
10
+
### Changed
11
+
- Implemented individual EventBridge Scheduler rules for each `SyncSettings` item in the configuration files.
12
+
- Simplified the Step Function by removing the Map state. Individual records are now processed one per execution, which enables the new features above and improves error visibility for failed executions. Further improvements are planned.
13
+
5
14
## 1.2.0 - 2024-11-07
6
15
### Added features
7
16
- Added the ability to define tags for every resource created by the solution. This can be configured using the `configuration/solution_parameters/parameters.json` file.
README.md (80 additions & 26 deletions)
@@ -1,12 +1,13 @@
1
1
# File Transfer Synchronization solution
2
2
## Introduction
3
3
4
-
This solution implements an automated strategy for synchronizing remote SFTP repositories with local S3 buckets. It schedules and orchestrates the process of listing remote directories, detecting changes, and transferring files.
4
+
This solution implements an automated strategy for synchronizing remote SFTP repositories with local S3 buckets. It orchestrates the process of listing remote directories, detecting changes, and transferring files. It can be run based on a scheduler or on-demand.
5
5
6
6
**The solution leverages the following AWS services:**
- [AWS Transfer Family SFTP Connectors](https://docs.aws.amazon.com/transfer/latest/userguide/creating-connectors.html)
10
+
- [AWS Systems Manager Parameter Store](https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-parameter-store.html)
10
11
11
12
**Key features:**
12
13
- Monitors remote SFTP servers using SFTP Connectors' [List capabilities](https://docs.aws.amazon.com/transfer/latest/userguide/sftp-connector-list-dir.html)
@@ -28,13 +29,19 @@ A combination of Lambda, Step Functions and Transfer Family features facilitates
28
29
29
30
### Component Interactions
30
31
31
-
1.**Event Bridge Scheduler**
32
-
- The Event Bridge Scheduler triggers the Step Function execution based on the configured schedule (e.g., daily, hourly, or a custom cron expression).
33
-
- There are multiple schedules based on the Configuration files in this project and the Event passed to Step Functions includes the required parameters according to each schedule configuration.
32
+
1. **Execution Phase**
33
+
34
+
a. **On-Demand Execution**
35
+
- You can manually execute the Step Function by using the event structure stored in SSM Parameter Store.
36
+
- While executing the Step Function, you can modify the `FromTimestamp` parameter in the event to specify the starting date and time for the file copy process.
37
+
38
+
b. **Event Bridge Scheduler**
39
+
- The Event Bridge Scheduler triggers the Step Function execution based on the configured schedule (e.g., daily, hourly, or a custom cron expression).
40
+
- There are multiple schedules based on the Configuration files in this project and the Event passed to Step Functions includes the required parameters according to each schedule configuration.
34
41
35
42
2. **Step Function**
36
43
- The Step Function orchestrates the entire process and coordinates the interaction between different components.
37
-
- For each `SyncSettings`, it invokes the `RemoteFoldersList` Lambda function interacts with the Transfer Family SFTP Connector to asynchronously retrieve a list of files in the remote folders to be synchronized.
44
+
- For each event, it invokes the `RemoteFoldersList` Lambda function, which interacts with the Transfer Family SFTP Connector to asynchronously retrieve a list of files in the remote folders to be synchronized.
38
45
- It then uses the `GetListStatus` Lambda function to check whether the `List` process has finished. If `Recursive` is enabled, it also retrieves the list of child folders so that a new list operation can be run for those subfolders.
39
46
- The `SyncRemoteFolder` Lambda function detects if new or modified files are available in the remote server, and then invokes the Transfer Family SFTP Connector to asynchronously transfer those files from the remote repository to the local S3 bucket.
40
47
- If any errors occur during the synchronization process, the Step Function captures the error and sends a notification to the configured SNS topic.
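The on-demand path described under the Execution Phase can be sketched in Python as follows. This is only an illustration under stated assumptions, not the solution's own tooling: the parameter name and state machine ARN are placeholders you would supply, `FromTimestamp` is assumed to be a top-level key of the stored event, and its unit (epoch seconds here) is also an assumption. The `boto3` calls `get_parameter` and `start_execution` are the standard ones for Parameter Store and Step Functions.

```python
import json


def build_on_demand_input(stored_event: str, from_timestamp: int = 0) -> str:
    """Return the Step Functions input JSON with `FromTimestamp` overridden.

    Assumption: `FromTimestamp` is a top-level key of the stored event.
    """
    event = json.loads(stored_event)
    event["FromTimestamp"] = from_timestamp
    return json.dumps(event)


def start_on_demand_sync(parameter_name: str, state_machine_arn: str,
                         from_timestamp: int = 0) -> str:
    """Read the stored event from Parameter Store and start one execution.

    `parameter_name` and `state_machine_arn` are hypothetical placeholders;
    use the values created by your own deployment.
    """
    import boto3  # imported lazily so the helper above has no AWS dependency

    ssm = boto3.client("ssm")
    sfn = boto3.client("stepfunctions")
    stored = ssm.get_parameter(Name=parameter_name)["Parameter"]["Value"]
    response = sfn.start_execution(
        stateMachineArn=state_machine_arn,
        input=build_on_demand_input(stored, from_timestamp),
    )
    return response["executionArn"]
```

Keeping the event rewrite in a separate pure function makes it easy to check the input document before starting a real execution.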
@@ -70,11 +77,11 @@ To do so, you just need to push new configuration changes as `json` files to `./
70
77
71
78
The configuration file structure and content needs the following data:
72
79
73
-
```
80
+
```json
74
81
{
75
82
"Description": <Connection Description>,
76
83
"Name": <Identifying name for resources, no spaces allowed>,
77
-
"Schedule": <Tag or AWS Cron Expression>,
84
+
"Schedule": <Tag, AWS Cron Expression or "on-demand">,
78
85
"Url": <Remote SFTP Server URL, FQDN and Port allowed>,
79
86
"SecurityPolicyName": <TransferSFTPConnectorSecurityPolicy-2024-03 or TransferSFTPConnectorSecurityPolicy-2023-07>,
80
87
"SyncSettings": [
@@ -87,7 +94,8 @@ The configuration file structure and content needs the following data:
87
94
"RemoteFolders": {
88
95
"Folder": <Remote Folder to Sync>,
89
96
"Recursive": <true / false>
90
-
}
97
+
},
98
+
(OPTIONAL) "Schedule": <Tag, AWS Cron Expression or "on-demand">
91
99
},
92
100
{ ... }
93
101
],
@@ -101,23 +109,69 @@ The configuration file structure and content needs the following data:
101
109
You can check the [example configuration file](configuration/examples/example-sftp-sync.json). Within AWS Account service limits, you can have as many configuration files as you need, and on the `SyncSettings` configuration list, you can define as many Remote to Local pairs as you wish and all will be run during the same schedule for the same Remote SFTP Server.
102
110
The CDK Application will automatically resolve all the IAM Role permissions needed for the process to work and will create all the needed resources, including Event Bridge Scheduler, SFTP Connector and Secrets Manager Secret.
103
111
104
-
### Cron configuration
105
-
For the Cron expression, you can use any of the pre-defined TAGs for simplicity or you can define your own cron expression. Keep in mind that this needs to be an [AWS Event Bridge Cron expression format](https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-scheduled-rule-pattern.html#eb-cron-expressions). Available TAGs are:
106
-
107
-
| TAG | Expression |
108
-
| :---------------- | :------: |
109
-
|@monthly| 0 0 1 * ? * |
110
-
|@daily| 0 0 * * ? * |
111
-
|@hourly| 0 * * * ? * |
112
-
|@minutely| * * * * ? * |
113
-
|@sunday| 0 0 ? * 1 * |
114
-
|@monday| 0 0 ? * 2 * |
115
-
|@tuesday| 0 0 ? * 3 * |
116
-
|@wednesday| 0 0 ? * 4 * |
117
-
|@thursday| 0 0 ? * 5 * |
118
-
|@friday| 0 0 ? * 6 * |
119
-
|@saturday| 0 0 ? * 7 * |
120
-
|@every10min| 0/10 * * * ? * |
112
+
### Schedule Configuration
113
+
114
+
The file synchronization process can be configured to run based on a schedule or on-demand. The solution supports both global schedules for entire configurations and individual schedules for specific sync settings.
115
+
116
+
#### Scheduling Strategies
117
+
118
+
1. **Cron / Tag Schedule**:
119
+
- Only considers files created in the remote repository between the current execution timestamp and the previous execution timestamp.
120
+
- Useful for regular, periodic synchronization while avoiding duplicate transfers.
121
+
122
+
2. **On-Demand execution**:
123
+
- When the `Schedule` value is configured as `on-demand`, you can pass an additional optional parameter called `FromTimestamp` when starting the Step Function execution. It defines the point in time (UTC timestamp) from which files are considered for copying.
124
+
- By default the value for `FromTimestamp` is set to 0, meaning that all files (newer than 1 January 1970 00:00:00) will be compared.
125
+
- Copies any files modified since the specified timestamp, including files that may have been deleted from S3 between runs but still exist on the remote SFTP server.
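The difference between the two strategies comes down to the time window applied to the remote listing. The sketch below illustrates that idea only; it is not the `SyncRemoteFolder` implementation. The `RemoteFile` type, the epoch-seconds unit, and the exact boundary handling (exclusive lower bound, inclusive upper bound) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RemoteFile:
    path: str
    modified_at: int  # assumed: seconds since the Unix epoch, UTC


def files_to_copy(listing: List[RemoteFile],
                  from_timestamp: int = 0,
                  to_timestamp: Optional[int] = None) -> List[RemoteFile]:
    """Keep files whose timestamp falls in (from_timestamp, to_timestamp].

    Scheduled run: from_timestamp is the previous execution time and
    to_timestamp is the current one.
    On-demand run: from_timestamp defaults to 0, so every file qualifies.
    """
    return [
        f for f in listing
        if f.modified_at > from_timestamp
        and (to_timestamp is None or f.modified_at <= to_timestamp)
    ]
```

With the default `from_timestamp` of 0 the filter passes every listed file, which is why an on-demand run can re-copy files that were deleted from S3 but still exist on the remote server.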
126
+
127
+
#### Individual Sync Setting Schedules
128
+
129
+
From version 1.3.0, you can define specific schedules for each item in your `SyncSettings` configuration list. This allows for:
130
+
- Different schedules for multiple folders within a single SFTP connection
131
+
- More granular control over synchronization timing
132
+
- Reduced resource consumption by eliminating the need for multiple configuration files
133
+
134
+
Individual item level schedules take precedence over the general `Schedule` setting.
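The precedence rule can be expressed in a few lines of Python. This is a minimal sketch that assumes the configuration file has already been loaded into a dictionary with the `Schedule` and `SyncSettings` keys shown earlier; the function name is hypothetical.

```python
from typing import List, Optional


def effective_schedules(config: dict) -> List[Optional[str]]:
    """Return one schedule per `SyncSettings` item.

    An item-level `Schedule` wins; otherwise the general `Schedule` applies.
    """
    default = config.get("Schedule")
    return [item.get("Schedule", default) for item in config.get("SyncSettings", [])]
```

For example, with a general `Schedule` of `@daily` and one item that sets `"Schedule": "on-demand"`, only that item is excluded from the daily run.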
135
+
136
+
**Example scenario:**
137
+
You need to synchronize a remote SFTP server with 10 folders: 5 of them are updated once a day at midnight, 2 are updated hourly, 1 is updated weekly, and for the remaining 2 you are notified when new files arrive so that you can run an on-demand copy. Before this update, you would have needed to create 4 configuration files, each with its own dedicated Transfer Family Connector, public IPs and Secrets. Now you can create a single configuration file (and its resources) with a different `Schedule` parameter for each item in the `SyncSettings` array, according to business needs.
* Custom cron expressions, which must use the [AWS Event Bridge Cron expression format](https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-scheduled-rule-pattern.html#eb-cron-expressions)
168
+
* "on-demand" for manual execution
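As a rough illustration of how the three kinds of `Schedule` value could be told apart, the sketch below maps a value to an EventBridge-style `cron(...)` expression. It is an assumption-laden example rather than the solution's code: the function name is hypothetical, the tag expressions are copied from the tag table in the previous README revision, only three tags are included, and wrapping the fields in `cron(...)` reflects the general EventBridge schedule expression syntax, not a confirmed detail of this project.

```python
from typing import Optional

# Subset of the predefined tags, copied from the earlier README tag table.
TAGS = {
    "@daily": "0 0 * * ? *",
    "@hourly": "0 * * * ? *",
    "@every10min": "0/10 * * * ? *",
}


def to_schedule_expression(schedule: str) -> Optional[str]:
    """Return a `cron(...)` expression, or None for on-demand items."""
    if schedule == "on-demand":
        return None  # no scheduler rule; the execution is started manually
    fields = TAGS.get(schedule, schedule)
    # AWS cron expressions have six fields:
    # minutes hours day-of-month month day-of-week year
    if len(fields.split()) != 6:
        raise ValueError(f"not a tag or six-field cron expression: {schedule!r}")
    return f"cron({fields})"
```

The field-count check is deliberately shallow; a real validator would also need to enforce rules such as using `?` in either the day-of-month or day-of-week field.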
169
+
170
+
#### Best Practices
171
+
* Choose schedules that align with your data update frequency
172
+
* Use individual schedules for folders with different update patterns
173
+
* Consider resource usage and costs when setting frequent schedules
174
+
* Test your configuration to ensure it meets your synchronization needs
121
175
122
176
### Target Bucket KMS Encryption
123
177
@@ -187,7 +241,7 @@ After the first replication, the solution will only copy new or modified files f
187
241
This project is built using Python3 and CDK. Before you start, make sure all the prerequisites are properly installed in your environment.