Commit 5c63521

File.dirname: add a spec for Shift JIS handling
While trying to speed up various `File.*` methods, I realized they were far slower and more complicated than they should be, for no apparent reason. However, after I asked Nobu, he explained that Shift JIS encoded text can contain `0x5C` (ASCII backslash) as the second byte of a two-byte character sequence. Since `0x5C` is `File::ALT_SEPARATOR` on Windows, this can easily break naive path-related algorithms that search for directory separators.
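
To make the failure mode concrete, here is a minimal sketch of how a byte-wise separator search gets fooled where a character-wise one does not. The helpers `naive_backslash_index` and `char_aware_backslash_index` are hypothetical names made up for this illustration; they are not part of Ruby or ruby/spec, and this is not how `File.dirname` is implemented.

```ruby
# Illustration only: both helpers below are hypothetical, written for this note.

# "file" + KATAKANA SO (bytes 0x83 0x5C in Shift JIS) + "name.txt"
path = "file\x83\x5Cname.txt".b.force_encoding(Encoding::SHIFT_JIS)

# Byte-wise search: treats every 0x5C byte as a backslash,
# including one that is the trail byte of a two-byte character.
def naive_backslash_index(str)
  str.bytes.index(0x5C)
end

# Character-wise search: only a single-byte character equal to 0x5C counts.
# Returns a character index, or nil when there is no real backslash.
def char_aware_backslash_index(str)
  str.each_char.with_index do |ch, i|
    return i if ch.bytes == [0x5C]
  end
  nil
end

naive_backslash_index(path)      # => 5 (false positive inside the two-byte character)
char_aware_backslash_index(path) # => nil (there is no real backslash)

# With a genuine backslash separator, both approaches agree.
win_path = "dir\\file\x83\x5Cname.txt".b.force_encoding(Encoding::SHIFT_JIS)
naive_backslash_index(win_path)      # => 3
char_aware_backslash_index(win_path) # => 3
```

The character-wise scan is slower than a raw byte search, which is presumably part of why the real implementation is more involved than it first appears.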
1 parent 2a98670 commit 5c63521

1 file changed

Lines changed: 29 additions & 0 deletions

core/file/dirname_spec.rb
@@ -78,6 +78,28 @@ def object.to_int; 2; end
     File.dirname("foo/../").should == "foo"
   end
 
+  it "rejects strings encoded with non ASCII-compatible encodings" do
+    Encoding.list.reject(&:ascii_compatible?).reject(&:dummy?).each do |enc|
+      path = "/foo/bar".encode(enc)
+      -> {
+        File.dirname(path)
+      }.should raise_error(Encoding::CompatibilityError)
+    end
+  end
+
+  it "works with all ASCII-compatible encodings" do
+    Encoding.list.select(&:ascii_compatible?).each do |enc|
+      File.dirname("/foo/bar".encode(enc)).should == "/foo".encode(enc)
+    end
+  end
+
+  it "handles Shift JIS 0x5C (\\) as second byte of a multi-byte sequence" do
+    # dir/fileソname.txt
+    path = "dir/file\x83\x5cname.txt".b.force_encoding(Encoding::SHIFT_JIS)
+    path.valid_encoding?.should be_true
+    File.dirname(path).should == "dir"
+  end
+
   platform_is_not :windows do
     it "ignores repeated leading / (edge cases on non-windows)" do
       File.dirname("/////foo/bar/").should == "/foo"
@@ -98,6 +120,13 @@ def object.to_int; 2; end
       File.dirname("//foo//").should == "//foo"
       File.dirname('/////').should == '//'
     end
+
+    it "handles Shift JIS 0x5C (\\) as second byte of a multi-byte sequence (windows)" do
+      # dir/fileソname.txt
+      path = "dir\\file\x83\x5cname.txt".b.force_encoding(Encoding::SHIFT_JIS)
+      path.valid_encoding?.should be_true
+      File.dirname(path).should == "dir"
+    end
   end
 
   it "accepts an object that has a #to_path method" do