English 中文(简体)
理论计算机科学何时有用?
原标题:
  • 时间:2008-10-24 22:00:55
  •  标签:

在课堂上,我们学习了停机问题、图灵机、规约等内容。许多同学说这些都是抽象无用的概念,而且学了也没有什么实际用途(也就是说,课程结束后可以忘记它们,而不会失去什么)。

为什么理论有用?你是否在日常编码中使用它?

最佳回答

当我大学毕业时,我认为我和其他人一样:“我有计算机科学学位,很多人也有,我们基本上可以做同样的事情。”最终我发现我的假设是错误的。我很出色,我的背景也有很多关系,尤其是我的学位。

我知道一个“轻微”的区别,那就是我获得了计算机科学的学士学位,因为我的大学是全国第二个(1987年据说是第二个)获得计算机科学学位认证的大学之一,并且我是第二个获得该认证学位的班级毕业生。那时,我并没有认为这很重要。

我在高中和大学期间也注意到,我在计算机科学方面表现得特别好 - 比我的同龄人甚至许多老师都要好。我经常被要求提供帮助,进行一些辅导,被邀请参与研究项目,并被允许在其他人不能进行独立研究时独立研究。我很高兴能够提供帮助,但我并没有多想这种差异。

在美国空军学院毕业后,我在空军服役了四年,其中两年应用了我的计算机科学学位。在那里,我注意到很少有同事拥有与计算机相关的学位或培训。空军将我送去参加为期五个月的认证培训,我再次发现缺乏学位或培训。但是在这里,我开始注意到了区别——许多我遇到的人实际上并不知道自己在做什么,这包括那些有培训或学位的人。请允许我举例说明。

在我的空军认证培训中,共有十三个人(包括我自己)。作为空军军官或等价职务,我们全部有学士学位。在年龄和军衔上,我处于中等水平(在六名O-1和六名O-3以上的军官中,我是一个O-2)。在这个培训结束时,空军认为我们都能够获得、建造、设计、维护和操作国防部的任何计算机或通信系统。

然而,在我们中的13个人中,只有6个拥有任何形式的计算机相关学位;另外7个人拥有从航空航天学到化学/生物学再到心理学各种学位。在我们6个拥有计算机科学学位的人中,我发现有两个从未编写过任何类型的程序,并且从未超过正常使用电脑(撰写论文,玩游戏等)。我了解到,其他两个人在计算机科学学位课程期间只在一台计算机上编写过一次程序。只有我和另外一个人编写了多个程序并使用了多种计算机 - 的确,我们发现我们两个编写了很多程序并使用了多种计算机。

在我们五个月的培训结束时,我们的班级被分配了一个编程项目,并被分成四个小组分别承担。我们的导师为了公平分配“编程天赋”,将班级分成了几个小组,并分配了团队领导、技术领导和开发人员的角色。每个小组都有一周的时间在一个由教练提供的飞行力学库之上(这是1990年),实现一个全屏、基于文本的用户界面,用于飞行模拟器。我被分配为我的四人团队的技术领导。

我的团队领导(没有计算机学位)要求我们这三个人将项目分成任务,然后将其中的三分之一分配给每个人。我在第一天中间完成了我三分之一的任务,然后花了其余的时间帮助我的另外两个队友,跟我的团队领导聊天(胡扯 ;^),还有在电脑上玩游戏。

第二天早上,我的团队领导私下告诉我,我们的另外两名队友没有取得任何进展(其中一人实际上无法编写可编译的“if”语句),他请求我接管他们的工作。我在下午中旬完成了整个项目,我的团队其余时间都在飞行模拟器。

另一个拥有相当CS学位的人也被指派为他的团队的技术领导,并在第三天结束时完成了。他们也开始飞行其模拟器。到本周结束时,其他两个团队都没有完成,也没有取得重大进展。我们不被允许帮助其他团队,因此就这样留下了。

同时,在第三天中间,我注意到飞行模拟器似乎表现得“不对劲”。由于我的一个同学拥有航空学学位,我问了他。他很困惑,然后坦白说他实际上不知道是什么让飞机飞行?!我感到震惊!原来,他的整个学位课程都是关于安全和事故调查的——飞行背后没有真正的数学或科学。另一方面,我可能有一个航空学的辅修(还记得美国空军学院吗?),但我们设计了机翼并进行了真正的风洞测试。因此,我安静地花了一周时间重写教师提供的飞行力学库,直到模拟器飞行“正确”。

从那时起,我作为承包商和雇员已经度过了将近20年的时间,始终从事软件开发和相关活动(DBA、架构师等)。我继续寻找更多的同类工作,最终我放弃了我的年轻假设。

So, what exactly have I discovered? Not every one is equal, and that is okay--I am not a better person because I can program effectively, but I am more useful IF that is what you need from me. I learned that my background really mattered: growing up in a family of electricians and electrical engineers, building electronics kits, reading LITERALLY every computer book in the school/public libraries because I did not have access to a real computer, then moving to a new city where my high school did have computers, then getting my own computer as a gift, going to schools that had computers of many different sizes and kinds (PCs to mainframes), getting an accredited degree from a VERY good engineering school, having to write lots of programs in different languages on different kinds of computers, having to write hard programs (like my own virtual machine with a custom assembly language, or a Huffman compression implementation, etc.), having to troubleshoot for myself, building my own computers from parts and installing ALL the software, etc.

最终,我发现我的能力是建立在对电脑的理解之上的,从电气水平的离散电子元件,电路,子系统,接口,协议,比特,字节,处理器,设备,驱动程序,库,程序,系统,网络,一直到我现在经常处理的大规模企业级集团。所以,当这个该死的东西出问题时,我知道如何和为什么出了问题。我知道什么是无法完成的,以及可以完成的。我也知道很多已经完成的工作,已经尝试过的工作,以及仍然相对未探索的工作。

最重要的是,在我学到所有这些之后,我意识到我什么都不知道。面对潜在而可知的一切,我的知识是微不足道的。

我对此感到十分满意。但是我建议你尝试一下。

问题回答

真实故事:

当我从研究生院毕业后得到了我的第一份编程工作时,我工作的公司的老板是飞行员。我被雇用几周后,其中一位老板问了我这个问题:

There are 106 airports in Arkansas. Could you write a program that would find the shortest rout necessary to land at each one of them?

我曾经认真地以为他在对我进行旅行推销员问题和 NP-完备性方面的知识测试。但后来发现他并不是这么想的。他对此一无所知。他真正想要的是一个可以找到最短路径的程序。当我解释说存在着106阶乘的解决方案,而且找到最好的解决方案是一个已知的计算难题时,他非常惊讶。

那是一个例子。

当然有用。

想象一下开发者正在开发一个模板引擎。你知道这种事情...

Blah blah blah ${MyTemplateString} blah blah blah.

它始于简单,用一个俏皮的正则表达式来执行替换。

但是逐渐地,模板会变得更加花哨,并且开发者会包括对字符串列表和地图进行模板化的功能。为了实现这一点,他编写了一个简单的语法并生成了解析器。

变得非常有技巧性,模板引擎可能最终包括条件逻辑的语法,以根据参数的值显示不同的文本块。

具有计算机科学理论背景的人会意识到模板语言正逐渐变得Turing完备,可能解释器模式是实现它的一种好方法。

一旦为模板构建了解释器,计算机科学家就可能注意到大部分请求都是重复的,一遍又一遍地重新生成相同的结果。因此,开发了一个缓存,所有请求都在执行昂贵的转换之前通过缓存路由。

另外,有些模板比其他模板复杂得多,需花费更长的时间进行渲染。也许有人会想到在渲染模板之前估计每个模板的执行时间。

但等等!!! 团队中的某个人指出,如果模板语言确实是图灵完备的,那么估计每个渲染操作的执行时间的任务就是停机问题的一个实例!! 太可怕了,不要那样做!!

关于理论的事情在实践中的意义在于所有实践都建立在理论上。理论上说。

我使用得最多的东西:

  • computational complexity to write algorithms that scale gracefully
  • understanding of how memory allocation, paging, and CPU caching work so I can write efficient code
  • understanding of data structures
  • understanding of threading, locking, and associated problems

关于图灵机等内容,我认为它很重要,因为它定义了我们所处情境的限制。这一点很值得理解。

这是学习代数和被教如何使用计算器之间的区别。

如果你懂代数,你会意识到同一个问题可能会有不同的表现形式,你也理解将问题转化为更简洁的形式的规则。

如果你只会使用计算器,你可能会浪费很多时间在一个已经解决、无法解决或者与其他问题类似(已解决或未解决)的问题上,因为你无法识别它们因为它们以不同的形式出现。

假装一会儿计算机科学是物理学…这个问题会显得愚蠢吗?

我的一个朋友正在使用一些模板工作在一个语言上。我被邀请进行一些咨询。我们讨论的部分是关于模板功能的,因为如果这些模板是图灵完备的,他们就必须考虑虚拟机的特性以及他们的编译器是否支持它。

我的故事到目前为止是这样的:自动机理论仍被教授,因为它仍然具有相关性。停机问题仍然存在,并提供了您能够做什么的界限。

现在,这些东西是否与敲击C#代码的数据库爱好者相关?可能没有。但当你开始进阶,你会希望了解你的根基和基础。

尽管我不直接在日常工作中应用它们,但我知道我的正式计算机科学教育影响了我的思考过程。我确实从一开始就避免了某些错误,因为我有正式方法所教导的教训。

他们学习时可能觉得这没有什么用,但我打赌,你的同学最终会遇到一个问题,他们会使用所学的知识或至少运用其思维模式。

上蜡...下蜡...上蜡...下蜡...这和空手道有什么关系呢?

在一份工作中,我被指派改进我们电力分布网络追踪算法的任务,因为他们使用的算法速度太慢了。三相网络本质上是三个 n-树(因为电力网络中不允许出现回路)。网络节点存储在数据库中,原始团队中的一些成员无法弄清如何构建内存模型,因此追踪是通过在数据库上进行连续的深度 SELECTs,每个相位上过滤来完成的。因此,要追踪距离变电站十个节点的节点,需要至少进行十次数据库查询(原始团队成员是数据库技术专家,但缺乏良好的算法背景)。

我编写了一个解决方案,将数据库中的 3 个 n-tree 节点网络转换为数据结构,其中每个节点仅存储一次在节点结构数组中,并且使用数组内的双向链接指针将 n-tree 关系转换为三个二叉树,以便可以轻松地向任一方向跟踪网络。

它至少快两个数量级,长下行跟踪则快三倍。

悲哀的是,我不得不向其他一些接受数据库和VB培训的程序员实际上教授n-树、二叉树、指针和双向链表,以便他们理解算法。

这是一个经典的二分法,介于“如何”和“什么”之间。你的同学们正在关注如何编写软件,并且他们非常专注于近视点;从这个角度来看,从实现的角度来看,似乎了解如停机状态和图灵机等内容是不重要的。

“如何”在计算机科学中实际工作量很少,我知道的大多数成功工程师可能认为它不到实际工作量的20%。“做什么”要远比“如何”重要。在这方面,计算机科学的基本原理至关重要。“想要做什么”与设计的关系要比实现的关系更密切,而良好的设计…嗯,就称其为“非平凡”吧。

有两种构建软件设计的方法:一种方式是使它如此简单,以至于明显没有缺陷,另一种方式是使它如此复杂,以至于没有明显的缺陷。第一种方法更加困难。- C.A.R. Hoare

祝你学业顺利!

我认为理解计算的基本模型是有用的:当然,从实践角度来看,您永远不需要能够将图灵机转换为寄存器机,但学习如何看到两个非常不同的问题实际上是同一概念的实例的能力是至关重要的。

大多数知识并不“实际”,但可以帮助你以不可预知的方式连接点,或者让你拥有更丰富的词汇,以描述更复杂的想法。

重要的不是你研究的具体问题,而是你通过研究它们所学到的原则。我每天在工作中都会使用关于算法、数据结构、编程语言和操作系统的概念。如果你作为程序员工作,你会时常做出影响系统性能的决策。你需要在基本抽象概念上有一个坚实的基础,才能做出正确的选择。

我在毕业后也曾认为:我们所学的那一堆理论在实践中完全没有用处。这在短时间内被证明是正确的,但是当你处理复杂的任务时,理论肯定比实践更有价值。每个编码几年后都可以编写一些“工作”的程序,但并不是每个人都能理解它们的工作原理。无论如何,我们大多数人在某个时刻都会处理性能问题、网络延迟、精度、可伸缩性等问题。在这个阶段,理论至关重要。为了在处理复杂系统时设计出一个好的解决方案,了解内存管理的工作原理、处理和线程的概念以及如何为它们分配内存,或者用于性能的有效数据结构等都非常重要。

例如,有一次我正在处理一个包括大量数学计算的项目,但在某一点上我们的软件出现了故障。在调试时,我发现在某些数学运算后,我收到的一个数字是值为1.000000000002的DOUBLE,但从数学角度来看不能>1,这在程序的某个后期阶段会导致传说中的NaN异常。我花了一些时间去弄清楚这个问题,但如果我更注意“Double和Float的近似值”课程,我就不会浪费那个时间了。

如果您在一家有突破性工作的公司工作,能够向架构师和开发人员传达好处是非常重要的。各种技术都存在很多炒作,摆脱这种困境非常棘手。当您以科学和理论术语来定义创新时,您会明显处于优势地位,并且客户会感知您是真实的。我能告诉大家:有一种新的处理状态、编码和非确定性(即复杂性)的方法,您肯定比今天更有生产力。

如果您在职业生涯中有长远的眼光,学习理论将赋予您深度,这种深度是您成长所需的。学习第五或第六种编程语言的投资回报率比学习第二和第三种要低得多。接触理论将让您对真正的工程有所感觉,了解自由度在哪里以及如何进行正确的取舍。

重要的概念包括1)状态,2)编码,3)非确定性。如果您不了解它们,它们将无法帮助您。理论应该帮助您了解整体情况,让您知道基本概念如何结合在一起,并帮助您提高直觉。 Zhòng yào de gàiniàn bāokuò 1) zhuàngtài, 2) biānmǎ, 3) fēi quèdìng xìng. Rúguǒ nín bù liǎojiě tāmen, tāmen jiāng wúfǎ bāngzhù nín. Lǐlùn yīnggāi bāngzhù nín liǎojiě zhěngtǐ qíngkuàng, ràng nín zhīdào jīběn gàiniàn rúhé jiéhé zài yìqǐ, bìng bāngzhù nín tí gāo zhízhì.

例子:上面的一些答案提到了停机问题和图灵机。当我在大学时读到图灵的论文时,我并没有感到开明。有一天我遇到了哥德尔的不完备性定理和哥德尔编号等内容,一切开始变得清晰起来。几年后我在一家书店看到了乔治·康托尔的书籍,现在我真正开始理解图灵机和停机问题。试着自己查一下维基百科上的“康托尔对角线论证”,那是你将会遇到的最令人敬畏的思维体验之一。

思考的食物:一个典型的图灵机并不是设计状态转换机的唯一方式。一个有两个而不是一个磁带的图灵机将为许多算法提供更高的速度。http://www.math.ucla.edu/~ynm/papers/eng.ps

你可以通过阅读这本书更有效地接触这些见解,比我做得更好。本帖底部的链接。 (至少,在亚马逊上查看目录,尝尝这是关于什么的口味):

我认为Rosenberg的这本书非常精彩。如果你只有一本理论方面的书,我认为这本书应该是最好的选择。

我不是每天都使用它。但它給了我很多理解,每天都对我有所帮助。

我发现我在计算机科学理论领域每天所需的幸福感,就是念诵“低耦合,高内聚”的口头禅。Roger S. Pressman在Steve McConnell之前将其学术化,Steve McConnell之后将其变得时尚。

Ya, I generally use state diagrams to design the shape and flow of the program. Once it works in theory, I start coding and testing, checking off the states as i go.

我发现它们也是向他人解释过程行为的有用工具。

Simple. For example: if I m using RSACryptoServiceProvider I d like to know what is that and why I can trust it.

Because C++ templates are actually some kind of lambda calculus. See www.cs.nott.ac.uk/types06/slides/michelbrink_types_06.pdf

I m studying for my Distributed algorithms course now. There is a chapter about fault tolerance and it contains some proofs on the upper bound for how many processes can fail (or misbehave) so that the distributed algorithm can handle it correctly.

For many problems, the bound for misbehaving processes is up to one third of total number of processes. This is quite useful in my opinion because you know that it s pointless to try to develop a better algorithm (under given assumptions).

Even if theoretical courses aren t going to be used directly, it might help you think better of something.

You don t know what your boss is going to ask you to do, you may have to use something that you thought it won t be benefical, as Jeffrey L Whitledge said.

To be honest, I sort of disagree with a lot of the answers here. I wrote my first compiler (for fun; I really have too much coffee/free time) without having taken a course in compilers; basically I just scanned the code for another compiler and followed the pattern. I could write a parser in C off the top of my head, but I don t think I could remember how to draw a pushdown automaton if my life depended on it.

When I decided I wanted to put type inference in my toy (imperative) programming language, I first looked over probably five papers, staring at something called "typed lambda calculus" going what.... the.... ****....? At first I tried implementing something with "generic variables" and "nongeneric variables" and had no idea what was going on. Then I scrapped it all, and sat there with a notebook figuring out how I could implement it practically with support for all the things I needed (sub-typing, first-class functions, parameterized types, etc.) With a couple days of thinking & writing test programs, I blew away more than a weeks worth of trying to figure out the theoretical crap.

Knowing the basics of computing (i.e. how virtual memory works, how filesystems work, threading/scheduling, SMP, data structures) have all proved HIGHLY useful. Complexity theory and Big-O stuff has sometimes proved useful (especially useful for things like RDBMS design). The halting problem and automata/Turing Machine theory? Never.

I know this is old, but my short reply to those who claim that theory is useless and that they can practice their profession without it is this:

Without the underlying theory, there is no practice.

Why is theory useful?

Theory is the underlying foundation on top of which other things are built. When theory is applied, practice is the result.

Consider computers today. The common computer today is modeled and built on top of the Turing Machine, which, to keep it simple, is an abstract/theoretical model for computation. This theoretical model lies at the foundation of computing, and all the computing devices we use today, from high-end servers to pocket phones, work because the underlying foundation is sound.

Consider algorithm analysis. In simple terms, algorithm analysis and time-complexity theory have been used to classify problems (e.g. P, NP, EXP, etc) as well as how the algorithms we have behave when trying to solve different problems in different classes.

Suppose one of your friends gets a job at some place X and, while there, a manager makes a few simple requests, such as these examples:

Ex 1: We have a large fleet of delivery vehicles that visit different cities across several states. We need you to implement a system to figure out what the shortest route for each vehicle is and choose the optimal one out of all the possibilities. Can you do it?

Thinking the theory is useless your friends don t realize that they ve just been given the Traveling Salesman Problem (TSP) and start designing this system without a second thought, only to discover their naive attempt to check all the possibilities, as originally requested, is so slow their system is unusable for any practical purposes.

In fact, they have no idea why the system works at an "acceptable" level when checking 5 cities, yet becomes very slow at 10 cities, and just freezes when going up to only 40 cities. They reason that it s only "2x and 8x more cities than the 5 city test" and wonder why the program does not simply require "2x and 8x more time" respectively...

Understanding the theory would ve allowed them to realize the following, at least at a glance:

  1. It s the TSP
  2. The TSP is NP-hard
  3. Their algorithm s order of growth is O(n!)

The numbers speak for themselves:

+--------------+-------+-----------------------------------------------------------------+
|  No. Cities  | O(N!) |  Possibilities                                                  |
+--------------+-------+-----------------------------------------------------------------+
|       5      |   5!  | 120                                                             |
|      10      |  10!  | 3,628,800                                                       |
|      40      |  40!  | 815,915,283,247,897,734,345,611,269,596,115,894,272,000,000,000 |  <-- GG
+--------------+-------+-----------------------------------------------------------------+

They could ve realized at the outset that their system was not going to work as they imagined it would. The system was later considered impractical and cancelled after a significant amount of time, effort, and other resources had been allocated to, and ultimately wasted on, the project --and all because thought "theory is useless".

So after this failure, the managers think "Well, maybe that system was underestimated; after all, there re a LOT of cities in our country and our computers are simply not as fast as we need them to be for our recently cancelled system to have been a success".

The management team blames slow computers as the cause of the project s failure. After all, they re not experts in CS theory, don t need to be, and those who re supposed to be the experts on the topic and could ve informed them, didn t.

But they have another project in mind. A simpler one actually. They come the week later and ask say the following:

Ex 2: We have only a few servers and we have programmers who keep submitting programs that, due to unknown reasons, end up in infinite cycles and hogging down the servers. We need you to write a program that will process the code being submitted and detect whether the submitted program will cause an infinite cycle during its run or not, and decide whether the submitted program should be allowed to run on this basis. Can you do it?

Your dear friend accepts the challenge again and goes to work immediately. After several weeks of work, there re no results, your friend is stressed, and doesn t know what to do. Yet another failure... your friend now feels "dumb" for not having been able to solve this "simple problem"... after all, the request itself made it sound simple.

Unfortunately, your friend, while insisting that "theory is useless" didn t realize that the, allegedly simple, request was actually an intractable problem about decidability (i.e. the halting problem itself), and that there was no known solution for it. It was an impossible task.

Therefore, even starting work to solve that particular problem was an avoidable and preventable mistake. Had the theoretical framework to understand what was being requested been in place, they could ve just proposed a different, and achievable, solution... such as implementing a monitoring process that can simply kill -SIGTERM <id> of any user process (as per a list of users) that monopolizes the CPU for some arbitrary/reasonable interval under certain assumptions (e.g. we know every program run should ve terminated within 10 minutes, so any instance running for 20+ minutes should be killed).

In conclusion, practice without the theory is like a building without a foundation. Sooner or later, the right amount of pressure from the right angle will make it collapse in on itself. No exceptions.

Do you ever use it in your day-to-day coding?

Yes, but not directly. Rather, we rely on it indirectly. The caveat here is that different theoretical concepts will be more or less applicable depending on the problem domain you happen to be working on.

Surely, we:

  1. use computers daily, which relies on computational models (e.g. turing machines)
  2. write code, which relies on computability theory (e.g. what s even computable) and lambda calculus (e.g. for programming languages)
  3. rely on color theory and models (e.g. RGB and CMYK color models) for color displays and printing, etc.
  4. Euler s theorems in computer graphics so that matrices can be built to rotate objects about arbitrary axes, and so on...

It s a fact that someone who simply use a plane to travel doesn t need to understand the theory that even allowed planes to be built and fly in the first place... but when someone is expected to build said machines and make them work... can you really expect a good outcome from someone who doesn t understand even the principles of flight?

Was it really a coincidence that, for most of history, no one was able to build a flying machine (and a few even died testing theirs) until the Wright brothers understood certain theoretical concepts about flight and managed to put them into practice?

It s no coincidence. We have a lot of working technology today because the people who built them understood, and applied, the theoretical principles that allowed them to work in the first place.

I guess it depends on which field you go into.





相关问题
热门标签