I have written a program in gawk that downloads lots of small bits of information from the internet. (A media scanner and indexer.)
At present it launches wget to get the information. This is fine, but I'd like to simply re-use the connection between fetches. A single run of the program might make between 200 and 2000 calls to the same API service.
I've just discovered that gawk can do networking and found geturl. However, the advice at the bottom of that page is well heeded: I can't find an easy way to read the last line and keep the connection open.
As it happens I mostly read JSON data, so I can set RS="}" and quit once the body reaches the expected content length. That could break on any trailing white space, though, and I'd prefer a more robust approach. Does anyone have a neat way of making sporadic HTTP requests from awk that keeps the connection open? Currently I have the following structure...
con="/inet/tcp/0/host/80";
send_http_request(con);
RS="
";
read_headers();
# now read the body - but do not close the connection...
RS="}"; # for JSON
while ( con |& getline bytes ) {
body = body bytes RS;
if (length(body) >= content_length) break;
print length(body);
}
# Do not close con here - keep open
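For reference, the two helpers used above look roughly like this - a simplified sketch rather than the exact code, with the host name, request path and error handling reduced to placeholders. The only part that matters for the loop above is that read_headers() fills in the global content_length:

function send_http_request(con,   path) {
    # One request per call over the already-open coprocess.
    # "host" and the path are placeholders; Connection: keep-alive asks the
    # server to leave the socket open after the response.
    path = "/some/endpoint";
    printf "GET %s HTTP/1.1\r\nHost: host\r\nConnection: keep-alive\r\n\r\n", path |& con;
}

function read_headers(con,   line) {
    # Called while RS="\r\n", so each getline returns one header line.
    # Sets the global content_length that the body loop checks against.
    content_length = 0;
    while ( (con |& getline line) > 0 ) {
        if (line == "") break;                  # blank line ends the headers
        if (tolower(line) ~ /^content-length:/) {
            sub(/^[^:]*:[ \t]*/, "", line);
            content_length = line + 0;
        }
    }
}

With HTTP/1.1 and Connection: keep-alive the server should leave the socket open after each response, which is exactly why I want to stop reading at content_length rather than waiting for EOF.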
It's a shame this one little thing seems to be spoiling all the potential here. Also, in case anyone asks :) ..
- awk was originally chosen for historical reasons - there were not many other language options on this embedded platform at the time.
- Gathering up all of the URLs in advance and passing to wget will not be easy.
- re-implementing in perl/python etc is not a quick solution.
- I've looked at trying to pipe URLs through a named pipe into wget -i -; that doesn't work. The data gets buffered, and unbuffer is not available - also I think wget gathers up all the URLs until EOF before processing them.
- The data is small so lack of compression is not an issue.