Weird HTTP problem/mistake with Lisp
  • Time: 2009-01-15 04:01:12

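I'm trying to learn a bit about handling sockets and network connections in SBCL, so I wrote a simple wrapper for HTTP. So far it just creates a stream and performs a request, ultimately getting the header data and page content of a website.

Until now it has worked at a somewhat decent level. Nothing to brag about, but at least it worked.

I have run into a strange problem, however: I keep getting "400 Bad Request" errors.

At first I was somewhat leery about how I was processing the HTTP requests (more or less passing a request string as a function argument), so I wrote a function that formats a query string with all the parts I need and returns it for later use... but I still get errors.

What's even stranger is that the errors don't happen every time. If I try the script on a page like Google, I get a "200 Ok" return value... but at other times, on other sites, I get "400 Bad Request".

I'm sure it's a problem with my code, but I can't work out what exactly is causing it.

Here is the code I'm working with: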

(use-package :sb-bsd-sockets)

(defun read-buf-nonblock (buffer stream)
  (let ((eof (gensym)))
    (do ((i 0 (1+ i))
         (c (read-char stream nil eof)
            (read-char-no-hang stream nil eof)))
        ((or (>= i (length buffer)) (not c) (eq c eof)) i)
      (setf (elt buffer i) c))))

(defun http-connect (host &optional (port 80))
"Create I/O stream to given host on a specified port"
  (let ((socket (make-instance 'inet-socket
                   :type :stream
                   :protocol :tcp)))
    (socket-connect
     socket (car (host-ent-addresses (get-host-by-name host))) port)
    (let ((stream (socket-make-stream socket
                    :input t
                    :output t
                    :buffering :none)))
      stream)))

(defun http-request (stream request &optional (buffer 1024))
"Perform HTTP request on a specified stream"
  (format stream "~a~%~%" request )
  (let ((data (make-string buffer)))
    (setf data (subseq data 0
               (read-buf-nonblock data
                      stream)))
    (princ data)
    (> (length data) 0)))

(defun request (host request)
"formated HTTP request"
  (format nil "~a HTTP/1.0 Host: ~a" request host))

(defun get-page (host &optional (request "GET /"))
"simple demo to get content of a page"
  (let ((stream (http-connect host)))
    (http-request stream (request host request))))

Best answer

A few things. First, on your concern about getting 400 errors, there are a few possibilities:

  • "Host:" isn't actually a valid header field in HTTP/1.0, and depending on how strict the web server you are contacting is about standards, it may reject this as a bad request based on the protocol you claim to be speaking.
  • You need a CRLF between your Request-line and each of the header lines.
  • It is possible that your (request) function is returning something for the Request-URI field -- you substitute in the value of request as the contents of this part of the Request-line -- that is bogus in one way or another (badly escaped characters, etc.). Seeing what it is outputting might help out some.
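One way to address the first two points together is sketched below. It assumes you are willing to claim HTTP/1.1, where the Host header is required, and to ask the server to close the connection after the response. The name build-get-request is hypothetical and not part of the code in the question.

```lisp
;; Sketch only: BUILD-GET-REQUEST is a hypothetical helper name.
;; It speaks HTTP/1.1 (an assumption) so that the Host header is valid,
;; and ends every line with an explicit CR LF pair.
(defun build-get-request (host &optional (path "/"))
  "Return a GET request for PATH on HOST, terminated by an empty line."
  (let ((crlf (coerce (list #\Return #\Linefeed) 'string)))
    (format nil "GET ~a HTTP/1.1~aHost: ~a~aConnection: close~a~a"
            path crlf host crlf crlf crlf)))
```

Because the string already ends with the blank line that terminates the headers, it would be sent with write-string followed by finish-output, rather than through a format call that appends ~%~% again.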

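Some more general pointers that should help you along:

  • "read-buf-nonblock" is very confusing. Where is the symbol "c" defined? Why is "eof" (a gensym) defined and then never assigned a value? It looks very much like a byte-for-byte copy taken straight out of an imperative program and dropped into Lisp. What you have reimplemented here looks like "read-sequence". Look it up in the Common Lisp HyperSpec and see whether it is what you need. The other half of this is to set the socket you created to be non-blocking. This is pretty easy, even though the SBCL documentation is almost silent on the topic. Use this: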

    (socket-make-stream socket :input t :output t :buffering :none :timeout 0)

  • The last (let) form of (http-connect) isn't necessary. Just evaluate

    (socket-make-stream socket :input t :output t :buffering :none)

Without the "let", http-connect should still return the right value.

  • In (http-request)...

Replace:

 (format stream "~a~%~%" request )
 (let ((data (make-string buffer)))
 (setf data (subseq data 0
            (read-buf-nonblock data
                               stream)))
 (princ data)
 (> (length data) 0)))

with

(format stream "~a~%~%" request )
(let ((data (read-buf-nonblock stream)))
    (princ data)
    (> (length data) 0)))

and make (read-buf-nonblock) return the string of data, rather than having it assigned within the function. So where you have buffer being assigned, create a variable buffer within the function and then return it. What you are doing is called relying on "side effects," and it tends to produce more errors, and errors that are harder to find. Use it only when you have to, especially in a language that makes it easy not to depend on them.

  • I mostly like the way get-page is defined. It feels very much in the functional programming paradigm. However, you should change either the name of the (request) function or the name of the variable request. Having both in there is confusing.
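As a rough illustration of the suggestion that the reader should return its data instead of filling a buffer passed in by the caller, here is a minimal sketch. The name read-available and the limit parameter are hypothetical. It keeps the question's approach of blocking for the first character and then taking only what is already available.

```lisp
;; Sketch only: READ-AVAILABLE is a hypothetical replacement for
;; read-buf-nonblock that returns a fresh string instead of mutating
;; a caller-supplied buffer.
(defun read-available (stream &optional (limit 1024))
  "Return a string of at most LIMIT characters read from STREAM."
  (with-output-to-string (out)
    (loop for i from 0 below limit
          ;; Wait for the first character; after that, stop as soon as
          ;; nothing is ready (READ-CHAR-NO-HANG returns NIL) or at EOF.
          for c = (read-char stream nil nil)
            then (read-char-no-hang stream nil nil)
          while c
          do (write-char c out))))
```

With this shape, the body of http-request reduces to binding the result of (read-available stream), printing it, and checking its length, with no make-string or subseq in the caller.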

Whew, my hands hurt. But hopefully this helps. Done typing. :-)

Other answers

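Here's one possibility:

HTTP/1.0 defines the sequence CR LF as the end-of-line marker.

The ~% format directive generates a #\Newline (LF on most platforms, but see the CLHS).

Some sites may be tolerant of the missing CR, others not so much.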
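A quick way to check this on your own implementation is to look at the character codes that format actually produces. The helper name char-codes is hypothetical, and the snippet assumes an implementation such as SBCL, where the semi-standard names #\Return and #\Linefeed are available.

```lisp
;; Sketch only: CHAR-CODES is a hypothetical helper for inspecting output.
(defun char-codes (string)
  "Return the list of character codes in STRING."
  (map 'list #'char-code string))

;; What the question's code sends as a line ending: a single #\Newline.
(print (char-codes (format nil "~%")))

;; What HTTP expects: an explicit carriage return followed by a line feed.
(print (char-codes (format nil "~c~c" #\Return #\Linefeed)))
```

If the first form prints a single code and the second prints two, the request built with ~% is missing the carriage return.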




