Improve fetch script, ditch the serial number

Nemo 2022-04-25 12:38:09 +05:30
parent 31c1a080ab
commit 4553266282
3 changed files with 31 additions and 15 deletions


@@ -22,10 +22,19 @@ jobs:
format: "YYYY.M.D" format: "YYYY.M.D"
- name: Update data - name: Update data
run: ./fetch.sh run: ./fetch.sh
# Only tag if we're running on the scheduled job
- uses: stefanzweifel/git-auto-commit-action@v4 - uses: stefanzweifel/git-auto-commit-action@v4
if: ${{ github.event_name == 'schedule' }}
with: with:
commit_message: Update ISIN Data commit_message: Update ISIN Data
commit_author: 'github-actions[bot] <github-actions[bot]@users.noreply.github.com>' commit_author: 'github-actions[bot] <github-actions[bot]@users.noreply.github.com>'
file_pattern: "*.csv" file_pattern: "*.csv"
status_options: '--untracked-files=no' status_options: '--untracked-files=no'
tagging_message: "v${{ steps.current-time.outputs.formattedTime }}" tagging_message: "v${{ steps.current-time.outputs.formattedTime }}"
- uses: stefanzweifel/git-auto-commit-action@v4
if: ${{ github.event_name == 'push' }}
with:
commit_message: Update ISIN Data
commit_author: 'github-actions[bot] <github-actions[bot]@users.noreply.github.com>'
file_pattern: "*.csv"
status_options: '--untracked-files=no'


@@ -2,23 +2,25 @@
ISIN Data from various public securities. ISIN Data from various public securities.
Source: NSDL provides an ISIN Search at <https://nsdl.co.in/master_search.php>. Source: [NSDL Website Detailed ISIN Search][nsdl].
Automatically updated every Sunday using GitHub Actions. Automatically updated every Sunday using GitHub Actions.
Currently tracked: Currently tracked:
|File|Issuer| |File|Issuer|Tracked|
-----|----- -----|-----|----|
`INA.csv`|Central Government `INA.csv`|Central Government|No
`INB.csv`|State Government `INB.csv`|State Government|No
`INE.csv`|Company, Statutory Corporation, Banking Company `INE.csv`|Company, Statutory Corporation, Banking Company|Yes
`INF.csv`|Mutual Funds `INF.csv`|Mutual Funds|Yes
`IN9.csv`|Partly paid up shares `IN9.csv`|Partly paid up shares|Yes
**Note**: The [NSDL Website][nsdl] returns zero valid results for `INA, INB`, so those are not tracked.
# Code # Code
You can run the `fetch.sh` script to generate all the files from scratch. Dependencies: You can run the `fetch.sh` script to generate the tracked files from scratch. Dependencies:
- https://github.com/ericchiang/pup - https://github.com/ericchiang/pup
- https://stedolan.github.io/jq/ - https://stedolan.github.io/jq/
@@ -27,9 +29,13 @@ You can run the `fetch.sh` script to generate all the files from scratch. Depend
# Structure # Structure
See https://www.basunivesh.com/how-your-dmat-mutual-funds-and-shares-isin-structured/ - https://www.basunivesh.com/how-your-dmat-mutual-funds-and-shares-isin-structured/
- https://theindianstockbrokers.com/what-is-isin-number-and-how-to-find-it/
# Alternative Sources # Alternative Sources
- https://nsdl.co.in/downloadables/html/hold-mutual-fund-units.html - https://nsdl.co.in/downloadables/html/hold-mutual-fund-units.html
- [The Kuvera Mutual Fund Details API](https://stoplight.captnemo.in/docs/kuvera/reference/Kuvera.yaml/paths/~1mf~1api~1v4~1fund_schemes~1%7Bcodes%7D.json/get) returns ISIN codes. - [The Kuvera Mutual Fund Details API](https://stoplight.captnemo.in/docs/kuvera/reference/Kuvera.yaml/paths/~1mf~1api~1v4~1fund_schemes~1%7Bcodes%7D.json/get) returns ISIN codes.
[nsdl]: https://nsdl.co.in/master_search.php


@@ -21,9 +21,7 @@ function fetch_page() {
--connect-timeout 10 \ --connect-timeout 10 \
--retry-max-time 30 \ --retry-max-time 30 \
--data cnum=$1 \ --data cnum=$1 \
--data "page_no=$2" | --data "page_no=$2" | $PUP_BINARY '#nsdl-tables tr json{}' | \
# for each row
$PUP_BINARY '#nsdl-tables tr json{}' | \
# generate 5 lines (second column has a link, so parse that) with raw output # generate 5 lines (second column has a link, so parse that) with raw output
jq --raw-output '.[] | [.children[1].children[0].text, .children[2].text, .children[3].text,.children[4].text,.children[5].text]|.[]' | \ jq --raw-output '.[] | [.children[1].children[0].text, .children[2].text, .children[3].text,.children[4].text,.children[5].text]|.[]' | \
# and create a CSV from every 5 lines # and create a CSV from every 5 lines
@@ -47,11 +45,14 @@ function fetch_class() {
done done
} }
for i in A B E F 9; do for i in E F 9; do
total=$(fetch_total_pages "IN$i") total=$(fetch_total_pages "IN$i")
echo "::group::IN$i (Total=$total)" echo "::group::IN$i (Total=$total)"
rm "IN$i.csv"
fetch_class "IN$i" $total fetch_class "IN$i" $total
echo "::endgroup::" echo "::endgroup::"
# Sort the file in place
sort -o "IN$i.csv" "IN$i.csv"
done done
sem --wait sem --wait
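
The loop in the last hunk removes each CSV, refetches it, and sorts it in place. A minimal sketch of that remove, rebuild, sort pattern follows. It is an illustration, not the actual `fetch.sh`: `produce_fields` is a hypothetical stand-in for `fetch_class`, the record values are placeholders, and joining every 5 lines with `paste` is an assumption about how the CSV rows are assembled. It also uses `rm -f`, which (unlike the plain `rm` in the diff) does not fail when the file is missing.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for fetch_class: emits one field per line,
# five fields per record, deliberately out of order. Placeholder data only.
produce_fields() {
  printf '%s\n' \
    'IN0000000002' 'Example B' 'Desc B' 'Type B' 'ACTIVE' \
    'IN0000000001' 'Example A' 'Desc A' 'Type A' 'ACTIVE'
}

# Rebuild one CSV from scratch and sort it in place.
rebuild_sorted() {
  local file=$1
  rm -f "$file"                                    # -f: no error if the file is missing
  produce_fields | paste -d, - - - - - >> "$file"  # join every 5 lines into one CSV row
  LC_ALL=C sort -o "$file" "$file"                 # sort -o may name one of its own inputs
}

demo_dir=$(mktemp -d)
rebuild_sorted "$demo_dir/INX.csv"
cat "$demo_dir/INX.csv"
```

Deleting the file before appending keeps repeated runs from accumulating duplicate rows, and sorting with a fixed locale (`LC_ALL=C`) keeps the row order stable between runs, so commits only show rows that were actually added or removed.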