Thursday, May 15, 2014

Re: Can you tell me Step-By-Step Guidelines of How to use HtmlUnit to make GWT app Crawlable?

OK, here is the problem I have with HtmlUnit.

I have this code

    url_with_hash_fragment = "http://127.0.0.1:8888/Myproject.html?gwt.codesvr=127.0.0.1:9997#!home";

    // use the headless browser to obtain an HTML snapshot
    final WebClient webClient = new WebClient();
    HtmlPage page = webClient.getPage(url_with_hash_fragment);

    // important!  Give the headless browser enough time to execute JavaScript
    // The exact time to wait may depend on your application.
    webClient.waitForBackgroundJavaScript(2000);

    // return the snapshot
    PrintWriter out = response.getWriter();
    out.println(page.asXml());


In my Eclipse console, it first lists CSS code like:

gwt-ToggleButton-down {
  background-position: 0 -513px;
  border: 1px inset #ccc;
  cursor: pointer;
  cursor: hand;
}
.gwt-ToggleButton-down-hovering {
  background-position: 0 -513px;
  border: 1px inset;
  border-color: #9cf #69e #69e #7af;
  cursor: pointer;
  cursor: hand;
}....
: null
java.util.EmptyStackException
at java.util.Stack.peek(Unknown Source)

Why do we get an EmptyStackException?

If I open http://127.0.0.1:8888/Myproject.html?gwt.codesvr=127.0.0.1:9997#!home in a normal browser, without HtmlUnit, everything is fine. But if I load the same URL through HtmlUnit, I get the EmptyStackException.

How can I be 100% sure that HtmlUnit will never throw an error for any kind of HTML code?

On Friday, May 16, 2014 12:06:24 AM UTC+10, Jens wrote: 
HtmlUnit is bundled as a jar file, so you can put it (and all its dependencies) into WEB-INF/lib of your war.

Then you need to write a servlet that takes the Googlebot's request, rewrites the _escaped_fragment_ parameter back to the original #!<token> URL and starts HtmlUnit with that URL. The rendered page is then returned by the servlet.
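
The rewriting step could look roughly like the sketch below. It is only an illustration, not code from this thread: the class name EscapedFragmentUrls and the method toHashBangUrl are made up, and it assumes the crawler appends _escaped_fragment_ as the last query parameter with the token URL-encoded. In a servlet you would probably pass in request.getRequestURL().toString() and request.getQueryString().

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper (name is an assumption, not part of HtmlUnit or GWT).
final class EscapedFragmentUrls {
    private static final String PARAM = "_escaped_fragment_=";

    /**
     * Turns a crawler request back into the original "#!" URL.
     * baseUrl is the request URL without the query string,
     * queryString is the raw query string (may be null).
     * Returns null if the request has no _escaped_fragment_ parameter.
     */
    static String toHashBangUrl(String baseUrl, String queryString) {
        if (queryString == null) {
            return null;
        }
        int start;
        if (queryString.startsWith(PARAM)) {
            start = 0;
        } else {
            int amp = queryString.indexOf("&" + PARAM);
            if (amp < 0) {
                return null;
            }
            start = amp + 1;
        }
        // Everything before the parameter stays in the query string.
        String otherParams = start == 0 ? "" : queryString.substring(0, start - 1);
        // Assumption: the parameter is last, so the rest of the string is the token.
        String token = decode(queryString.substring(start + PARAM.length()));

        StringBuilder url = new StringBuilder(baseUrl);
        if (!otherParams.isEmpty()) {
            url.append('?').append(otherParams);
        }
        return url.append("#!").append(token).toString();
    }

    private static String decode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM.
            throw new IllegalStateException(e);
        }
    }
}
```

For a request to http://127.0.0.1:8888/Myproject.html with the query string gwt.codesvr=127.0.0.1:9997&_escaped_fragment_=home, this rebuilds the #!home URL, which can then be handed to webClient.getPage(...). A null result means the request did not come in the crawler format and can be served normally.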

At the bottom is an example:



The rendered page that you serve the Google Bot does not have to be a 1:1 copy of your original page. It is enough if the same content is available, styling is irrelevant. For example compare:



-- J.
