This PR works in principle.
Be aware that there is a bug in the WatchService that can cause created files to go unnoticed, e.g. if files are created in quick succession, or if a whole directory containing files is moved into the directory being listened to: the moved directory would be watched, but the files already in it won't be recognized. So it is not the same as inotify in Unix contexts; it only comes close to it.
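To illustrate the moved-directory case, here is a minimal sketch of one way a listener could catch up on files that already exist in a directory when it appears in the watched location. This is not code from this PR; the class name `MovedDirectoryRescan` and the method `filesBelow` are hypothetical, and the sketch assumes a one-off recursive listing is acceptable at that point.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper, not part of this PR. */
final class MovedDirectoryRescan {

    private MovedDirectoryRescan() {
    }

    /**
     * Returns the absolute paths of all regular files below the given
     * directory, sorted. Intended to be called once when a directory shows
     * up in the watched location, because no create events are delivered
     * for files that were already inside it.
     */
    static List<Path> filesBelow(final Path newDirectory) throws IOException {
        // Files.walk returns a lazily populated stream that must be closed.
        try (Stream<Path> paths = Files.walk(newDirectory)) {
            return paths
                .filter(Files::isRegularFile)
                .map(Path::toAbsolutePath)
                .sorted()
                .collect(Collectors.toList());
        }
    }
}
```

A listener could call `filesBelow` on the path of a newly reported directory and push the returned file names down the pipe just like regular create events.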
You may want to test it like this:
./gradlew assembleDist (in the root of this branch, to build the runner)
cd ./metafacture-runner/build/distributions/
tar xfz metafacture-core-702-addDirectoryListener-SNAPSHOT-dist.tar.gz
mkdir tmp (creates the directory to listen to)
metafacture-core-702-addDirectoryListener-SNAPSHOT-dist/flux.sh directoryListener.flux (to execute the FLUX)
touch 1 2 3 4 5 6 7 tmp/ (to create some files)
As output you should see the names of the files with their absolute paths (which could be passed to open-file in the FLUX). Some log messages are also printed to stdout; these do not go to the piped flux command (e.g. open-file).
(Interestingly, even when "touch"ing 7 files consecutively, the WatchService is fast enough to observe the creation of all of them.)
If you want the listener to shut down, trigger it by creating the specially named file: touch shutdownEtlNow
We may want to discuss whether it's necessary to improve the behaviour by building some workarounds. One idea would be to simply traverse the given directory every n seconds, record the filenames in a Map and, if any new ones appear, push these down the pipe. This would guarantee that no file is missed (at the cost of not getting the filename instantly when a file is created).
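The polling idea could be sketched roughly as follows. This is only an illustration under assumptions, not code from this PR: the class name `PollingDirectoryScanner` and the method `scan` are hypothetical, a Set is used instead of a Map because only the names need to be remembered, and modified files are deliberately not detected.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper, not part of this PR. */
final class PollingDirectoryScanner {

    private final Path directory;
    // Absolute paths that were already reported by an earlier scan.
    private final Set<Path> seen = new HashSet<>();

    PollingDirectoryScanner(final Path directory) {
        this.directory = directory;
    }

    /**
     * Lists the directory once (non-recursively) and returns the absolute
     * paths of regular files that no earlier call has returned.
     */
    List<Path> scan() throws IOException {
        final List<Path> fresh = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(directory)) {
            for (final Path entry : stream) {
                final Path absolute = entry.toAbsolutePath();
                // Set.add returns false if the path is already known.
                if (Files.isRegularFile(absolute) && seen.add(absolute)) {
                    fresh.add(absolute);
                }
            }
        }
        return fresh;
    }
}
```

Calling `scan()` every n seconds, for example from a `ScheduledExecutorService`, and pushing each returned path down the pipe would give the behaviour described above. The set of seen paths grows with the number of files, which may matter for long-running listeners.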
functional review: @TobiasNx (and maybe @fsteeg as this PR is supposed to be part of a workflow in the RPB context).
discussion (re bug and workaround): also @blackwinter
code review: @blackwinter (or @fsteeg )
This creates a broken flux since the echo loses the quotes around tmp.
tobias@hbz-hp:~/temp$ '/home/tobias/git/metafacture-core/metafacture-runner/build/install/metafacture-core/flux.sh' directoryListener.flux
Exception in thread "main" org.metafacture.flux.FluxParseException: Variable tmp not assigned.
at org.metafacture.flux.parser.FlowBuilder.exp(FlowBuilder.java:604)
at org.metafacture.flux.parser.FlowBuilder.exp(FlowBuilder.java:619)
at org.metafacture.flux.parser.FlowBuilder.varDef(FlowBuilder.java:386)
at org.metafacture.flux.parser.FlowBuilder.varDefs(FlowBuilder.java:287)
at org.metafacture.flux.parser.FlowBuilder.flux(FlowBuilder.java:105)
at org.metafacture.flux.FluxCompiler.compileFlow(FluxCompiler.java:66)
at org.metafacture.flux.FluxCompiler.compile(FluxCompiler.java:54)
at org.metafacture.runner.Flux.main(Flux.java:87)
After fixing this.
Your example flux with print seems to work fine when creating or modifying files. The logging is a little too verbose.
When changing print to write, the process closes the stream, and the flux breaks as soon as a file is modified or created.
tobias@hbz-hp:~/temp$ '/home/tobias/git/metafacture-core/metafacture-runner/build/install/metafacture-core/flux.sh' directoryListener.flux
Add directory to watch: /home/tobias/temp/tmp
Event kind:ENTRY_CREATE. File affected: .1.swp.
Exception in thread "Thread-0" org.metafacture.framework.MetafactureException: java.io.IOException: Stream closed
at org.metafacture.io.ObjectFileWriter.process(ObjectFileWriter.java:110)
at org.metafacture.io.ObjectWriter.process(ObjectWriter.java:147)
at org.metafacture.io.DirectoryListener$DirectoryWatcher.processFile(DirectoryListener.java:195)
at org.metafacture.io.DirectoryListener$DirectoryWatcher.run(DirectoryListener.java:166)
at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: java.io.IOException: Stream closed
at java.base/sun.nio.cs.StreamEncoder.ensureOpen(StreamEncoder.java:51)
at java.base/sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
at java.base/sun.nio.cs.StreamEncoder.write(StreamEncoder.java:142)
at java.base/java.io.OutputStreamWriter.write(OutputStreamWriter.java:223)
at java.base/java.io.Writer.write(Writer.java:249)
at org.metafacture.io.ObjectFileWriter.process(ObjectFileWriter.java:101)
... 4 more
What I also do not understand yet is how I can get at the filenames in order to further process the changed or created files. What is the output of listen-directory?
See #702.