2015:04:20:go
Go Language
Submit your code
Make sure your code respects the following architecture:
├── AUTHORS
├── server
│   └── server.go
├── html
│   └── index.html
└── README
Submission is due Sunday on asm.epita.it (no later than 23h42 and 42 seconds).
Objective of the TP
The aim of this TP is to apply the notions presented in Marwan's class. To do so, you will implement a website crawler which, in addition to copying the website, creates the website's spanning tree (in dot format).
NOTE: "Spanning tree" corresponds to "forêt couvrante" in French.
Subject Maintainer
Feel free to contact me with any question if something is unclear.
- gauber_r (Renaud "Greed" Gaubert) renaud@lse.epita.fr
GO
Why this language? After Marwan's class, you shouldn't be surprised to see a TP on Go. A theoretical approach is sometimes good, but the best way to learn is hands-on, so that's what we are going to do.
Compiler
For Go we will use the go compiler (https://golang.org/doc/install). Basic usage is shown with the following code, saved as hello.go:
package main

import "fmt"

func main() {
	fmt.Printf("hello, world\n")
}
If you want to run the code without a separate build step:
> go run hello.go
hello, world
If you want to compile your code:
> go build hello.go
> ls
hello hello.go
If you set up your environment correctly, you can compile directly using:
> echo $GOPATH
/home/login_x
> pwd
/home/login_x/godir/src/tp/hello_world
> ls
hello.go
> go build
> ls
hello_world hello.go
Language
The lecture slides are available here [1].
The best starting point for Go is obviously http://golang.org and its interactive tour. The Effective Go [2] page is also a good reference.
Finally, you can try some tutorials such as http://learnxinyminutes.com/docs/go/ or https://gobyexample.com/
A small example: Fibonacci
func fibo(n int) int {
	if n < 0 {
		return 0
	}
	// Single line comment
	if n <= 1 {
		return n
	}
	/**
	 * Multi-
	 * line comment
	 */
	return fibo(n-1) + fibo(n-2)
}
Multiple return values and misc
package main

import "fmt"

func main() {
	var x int // Variable declaration. Variables must be declared before use.
	x = 3     // Variable assignment.
	y := 4    // "Short" declarations use := to infer the type, declare, and assign.
	sum, prod := returnMultipleValues(x, y) // A function can return multiple values.
	fmt.Println(sum, prod)                  // Unused local variables are a compile error, so use them.
}

func returnMultipleValues(x, y int) (sum, prod int) {
	return x + y, x * y // Return two values.
}
Debugging
GDB does not understand Go programs well. The stack management, threading, and runtime contain aspects that differ enough from the execution model GDB expects that they can confuse the debugger, even when the program is compiled with gccgo.
As a consequence, although GDB can be useful in some situations, it is not a reliable debugger for Go programs, particularly heavily concurrent ones. Moreover, it is not a priority for the Go project to address these issues, which are difficult. In short, the instructions below should be taken only as a guide to how to use GDB when it works, not as a guarantee of success.
Introduction
When you compile and link your Go programs with the gc toolchain on Linux the resulting binaries contain DWARFv3 debugging information that recent versions (>7.1) of the GDB debugger can use to inspect a live process or a core dump.
The code generated by the gc compiler includes inlining of function invocations and registerization of variables. These optimizations can sometimes make debugging with gdb harder. To disable them when debugging, pass the flags -gcflags "-N -l" to the go command used to build the code being debugged.
Common Operations
Show file and line number for code, set breakpoints and disassemble:
(gdb) list
(gdb) list line
(gdb) list file.go:line
(gdb) break line
(gdb) break file.go:line
(gdb) disas
Show backtraces and unwind stack frames:
(gdb) bt
(gdb) frame n
Show the name, type and location on the stack frame of local variables, arguments and return values:
(gdb) info locals
(gdb) info args
(gdb) p variable
(gdb) whatis variable
Show the name, type and location of global variables:
(gdb) info variables regexp
Pretty printing a string, slice, map, channel or interface:
(gdb) p var
A $len() and $cap() function for strings, slices and maps:
(gdb) p $len(var)
That's it. You should now be able to find the documentation you need to learn how to use Go by yourself.
TP
Step 1: Get A Webpage
For this step you are asked to complete the function get_web_page, which takes a URL as a string and returns the HTML page (the body of the response):
func get_web_page(url string) string
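As a starting point, here is a minimal sketch of such a fetch using only the standard library (io.ReadAll requires Go 1.16 or later). It is one possible approach, not the expected solution. The names getWebPage and fetchFromLocalServer are assumptions made for illustration, and unlike the assigned signature this variant also returns an error. The helper serves a fixed page from a throwaway local server so the example runs without Internet access.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// getWebPage fetches rawURL and returns the response body as a string.
// Assumption: unlike the assigned get_web_page, it also reports errors.
func getWebPage(rawURL string) (string, error) {
	resp, err := http.Get(rawURL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("GET %s: unexpected status %s", rawURL, resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

// fetchFromLocalServer is a hypothetical demo helper: it starts a temporary
// local HTTP server that always answers with the same page, then fetches it.
func fetchFromLocalServer() (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "<html><body>hello</body></html>")
	}))
	defer srv.Close()
	return getWebPage(srv.URL)
}

func main() {
	page, err := fetchFromLocalServer()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(page)
}
```

To keep the exact signature from the subject, one option is to return an empty string when an error occurs.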
Step 2: Find All The Webpage's URLs
Write a function which takes the HTML and returns a slice of links:
func get_urls(html string) []string
HINT: https://godoc.org/golang.org/x/net/html is one possibility.
Step 3: Explore The URLs
Now you only need to connect the two functions you just wrote, making sure that you neither wander off to explore the whole Internet nor loop forever.
So make sure the URLs you explore are still in the same domain and that you are not exploring an already known URL.
HINT: This is similar to graph exploration, so don't hesitate to mark the links you have already visited (that is, use a map/hashmap).
The function you need to write is the following:
func explorer(base_url string) []string
NOTE: You could use recursion (it might be easier). You could also use a simple loop (in Go, a for loop with a condition plays the role of a while loop) and use a slice as a stack or a queue.
Step 4: Spanning Tree And HTTP Server
Since what you are doing is basically graph exploration, your goal here is to launch an HTTP server on port 8080 and serve:
- A basic web page which contains a form (only one) with a text field asking for the URL of the website to crawl.
Once the user submits the URL you must crawl the website, and display a page with the spanning tree (foret couvrante) generated using the dot command.
The function you need to write is the following:
func start_server()
HINT:
- Execute External: https://golang.org/pkg/os/exec/
- Http Server: http://golang.org/pkg/net/http/ :
  - HandleFunc
  - ListenAndServe
  - PostFormValue
  - ResponseWriter
  - Request
Good Luck !