fix(data_collector): fix us_index collector.py Http Error 403 Forbidden; Remove FutureWarning #2047
Merged
Contributor
Author
@microsoft-github-policy-service agree
Collaborator
Hi @kzhdev, it's nice to see the code you've contributed, and I think you're helping to make … Some suggestions:
Contributor
Author
I pushed another commit to address your suggestions. Please take another look.
Collaborator
Hi @kzhdev, thanks for your contribution. It looks great now. I really appreciate your help improving the project!
frydaiii pushed a commit to frydaiii/qlib that referenced this pull request Dec 5, 2025
…en; Remove FutureWarning (microsoft#2047)
* Fix 403 Forbidden error; Remove FutureWarning:
* use fake_useragent
* Fix lint format error
* Add timeout to fix pylint error
Description
Motivation and Context
The us_index collector.py stopped working because of the following error:
Traceback (most recent call last):
File "Y:\repo\qlib\scripts\data_collector\us_index\collector.py", line 273, in
fire.Fire(partial(get_instruments, market_index="us_index"))
File "C:\Users\auror\miniforge3\envs\qlib\Lib\site-packages\fire\core.py", line 135, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\auror\miniforge3\envs\qlib\Lib\site-packages\fire\core.py", line 559, in _Fire
component, remaining_args = _CallAndUpdateTrace(
^^^^^^^^^^^^^^^^^^^^
File "C:\Users\auror\miniforge3\envs\qlib\Lib\site-packages\fire\core.py", line 684, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^
File "\Trade-server\d\repo\qlib\scripts\data_collector\utils.py", line 672, in get_instruments
getattr(obj, method)()
File "\Trade-server\d\repo\qlib\scripts\data_collector\index.py", line 213, in parse_instruments
changers_df = self.get_changes()
^^^^^^^^^^^^^^^^^^
File "\Trade-server\d\repo\qlib\scripts\data_collector\us_index\collector.py", line 229, in get_changes
changes_df = pd.read_html(self.WIKISP500_CHANGES_URL)[-1]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\auror\miniforge3\envs\qlib\Lib\site-packages\pandas\io\html.py", line 1240, in read_html
return _parse(
^^^^^^^
File "C:\Users\auror\miniforge3\envs\qlib\Lib\site-packages\pandas\io\html.py", line 983, in _parse
tables = p.parse_tables()
^^^^^^^^^^^^^^^^
File "C:\Users\auror\miniforge3\envs\qlib\Lib\site-packages\pandas\io\html.py", line 249, in parse_tables
tables = self._parse_tables(self._build_doc(), self.match, self.attrs)
^^^^^^^^^^^^^^^^^
File "C:\Users\auror\miniforge3\envs\qlib\Lib\site-packages\pandas\io\html.py", line 806, in _build_doc
raise e
File "C:\Users\auror\miniforge3\envs\qlib\Lib\site-packages\pandas\io\html.py", line 785, in _build_doc
with get_handle(
^^^^^^^^^^^
File "C:\Users\auror\miniforge3\envs\qlib\Lib\site-packages\pandas\io\common.py", line 728, in get_handle
ioargs = _get_filepath_or_buffer(
^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\auror\miniforge3\envs\qlib\Lib\site-packages\pandas\io\common.py", line 384, in _get_filepath_or_buffer
with urlopen(req_info) as req:
^^^^^^^^^^^^^^^^^
File "C:\Users\auror\miniforge3\envs\qlib\Lib\site-packages\pandas\io\common.py", line 289, in urlopen
return urllib.request.urlopen(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\auror\miniforge3\envs\qlib\Lib\urllib\request.py", line 215, in urlopen
return opener.open(url, data, timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\auror\miniforge3\envs\qlib\Lib\urllib\request.py", line 521, in open
response = meth(req, response)
^^^^^^^^^^^^^^^^^^^
File "C:\Users\auror\miniforge3\envs\qlib\Lib\urllib\request.py", line 630, in http_response
response = self.parent.error(
^^^^^^^^^^^^^^^^^^
File "C:\Users\auror\miniforge3\envs\qlib\Lib\urllib\request.py", line 559, in error
return self._call_chain(*args)
^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\auror\miniforge3\envs\qlib\Lib\urllib\request.py", line 492, in _call_chain
result = func(*args)
^^^^^^^^^^^
File "C:\Users\auror\miniforge3\envs\qlib\Lib\urllib\request.py", line 639, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
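The traceback shows pd.read_html handing the URL straight to urllib.request.urlopen, which sends Python's default User-Agent. A plausible cause of the 403 is that the server rejects that default, so one workaround is to download the page yourself with an explicit User-Agent and a timeout, then parse the returned text. The following is a minimal sketch under those assumptions, using only the standard library; the names build_request, fetch_html and DEFAULT_USER_AGENT are hypothetical and are not taken from the PR, which according to its commit message uses fake_useragent to supply the header value.

```python
import urllib.request

# Assumption: any browser-like User-Agent string will do. The commit message
# mentions fake_useragent, which could supply one instead of this fixed value.
DEFAULT_USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
)


def build_request(url: str, user_agent: str = DEFAULT_USER_AGENT) -> urllib.request.Request:
    """Return a GET request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url: str, timeout: float = 30.0) -> str:
    """Download a page as text; urlopen raises HTTPError on 4xx/5xx responses."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset)
```

The text returned by fetch_html can then be passed to pd.read_html in place of the URL. The explicit timeout keeps the collector from hanging indefinitely on a stalled connection.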
The script also has the following FutureWarning:
scripts\data_collector\us_index\collector.py:151: FutureWarning: Passing literal html to 'read_html' is deprecated and will be removed in a future version. To read from a literal string, wrap it in a 'StringIO' object.
df_list = pd.read_html(_data.text)
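The warning itself names the remedy: wrap the literal HTML string in a StringIO object before passing it to read_html. A small self-contained illustration follows; html_text is a made-up stand-in for the response text held in _data.text, and the table contents are invented for the example. It assumes an HTML parser that pandas supports, such as lxml, is installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the response text held in `_data.text`.
html_text = """
<table>
  <tr><th>Symbol</th><th>Security</th></tr>
  <tr><td>AAA</td><td>Example Corp</td></tr>
  <tr><td>BBB</td><td>Sample Inc</td></tr>
</table>
"""

# Wrapping the string in StringIO hands read_html a file-like object,
# which is the form the FutureWarning asks for.
df_list = pd.read_html(StringIO(html_text))
df = df_list[0]
```

Applied to the line quoted in the warning, the call would read pd.read_html(StringIO(_data.text)).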
How Has This Been Tested?
Ran
python collector.py --index_name SP500 --qlib_dir ~/.qlib/qlib_data/us_data --method parse_instruments
and made sure the sp500.txt file is created successfully and the FutureWarning is gone.
pytest qlib/tests/test_all_pipeline.py under the upper directory of qlib.
Screenshots of Test Results (if appropriate):
Types of changes