I’ve been sitting on a few ideas for Help Builder’s HTML output, and today I finally got around to implementing some of them. Help Builder generates its output as HTML and lets you upload the content to a Web server via FTP, so you can get your documentation online within a few minutes.
Cool. But of course as soon as you provide something that looks like something else – a CHM file – people ask, where’s the Search capability? Well, see here, the CHM can search internally because it’s an engine, the HTML pages on the Web site are just plain dumb HTML pages – they know nothing about searching and indexes and don’t have proper rights to do anything useful with script code.
Alright, that was then – this is now. <g> So, I decided to do something simple about this. Something really simple: build a generic ASP.NET page – Search.aspx – that can be plunked on an ASP.NET-enabled server in the HTML directory and used for searching.
You can see this basic functionality here:
http://www.west-wind.com/westwindwebstore/docs/
If you click on the Search link you’re whisked to an ASPX search page that performs a very simplistic search task (which even though simplistic isn’t a whole lot worse than the default CHM search functionality in performance <g>).
How does it work? Help Builder now has a new option on the Build menu that adds a Search button to the table of contents. When you check this option, HB copies Search.aspx and adds the Search link to the table of contents; the link shows up if and only if the pages are running through HTTP. You need to enable this option yourself, and you are responsible for making sure ASP.NET is enabled in the directory (on my server I have ASP.NET enabled at the root directory and any of the ‘regular’ product directories automatically inherit the ASP.NET configuration, so simply copying the file is all that’s needed). In addition, the ASP.NET account (ASPNET/NETWORK SERVICE) needs read access to the help files.
Yup, read access, because as I said this is a very simplistic and not too scalable approach to searching. Search.aspx basically takes an input search string, opens each of the HTML files in the directory and looks for that string. Currently the search uses a plain IndexOf, but when I get a little more time a better parser will be added.
Now this is obviously a very inefficient way to search, but even on fairly large help files (this one has almost 1300 topics) the results return in about a second or less – well, after the first search. The first search is very slow – it can take 5-6 seconds – but once the server has accessed each of the files, the next searches return quickly. Help files don’t tend to be highly frequented areas of a Web site, so scalability is not likely to be a big problem, and unless a help file runs into the thousands of topics performance should be more than adequate.
Here’s what Search.aspx looks like:
<%@ Page language="c#" %>
<%@import namespace="System.IO"%>
<%@import namespace="System.Text"%>
<%@import namespace="System.Text.RegularExpressions"%>
<script runat="server" language="C#">
protected string PageTitle = "Test Project 5";
protected string ResultHtml = "";
protected int MatchCount = 0;
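private void Page_Load(object sender, System.EventArgs e)
{
    // An optional Title query string value overrides the default page title
    string Title = Request.QueryString["Title"];
    if (Title != null && Title != "")
        this.PageTitle = Title;

    // A postback without the button in the form data typically means Enter was
    // pressed in the textbox, so the click handler has to be called explicitly
    if (this.IsPostBack && Request.Form["btnSearch"] == null)
        this.btnSearch_Click(Page, EventArgs.Empty);
}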
private void btnSearch_Click(object sender, System.EventArgs e)
{
    // Search the directory this page lives in
    string Path = Request.PhysicalPath;
    Path = new FileInfo(Path).DirectoryName;

    string Search = this.txtSearch.Text;
    if (Search == "")
    {
        this.ResultHtml = "Please enter a search expression.";
        return;
    }

    // Only files matching _*.htm are searched
    string[] FileList = Directory.GetFiles(Path, "_*.htm");
    StringBuilder sb = new StringBuilder();
    foreach (string Filename in FileList)
    {
        FileInfo fi = new FileInfo(Filename);
        string Title = "";
        string Image;
        if (SearchFile(Filename, Search, out Title, out Image))
        {
            // One result line per match: topic icon plus a link to the topic
            sb.Append("<img src='bmp/" + Image + "'> <a href='" + fi.Name + "'>" +
                      Title + "</a><br>");
            this.MatchCount++;
        }
    }

    if (this.MatchCount == 0)
        this.ResultHtml = "No matching topics found.";
    else
    {
        this.ResultHtml = "<small> " +
                          MatchCount.ToString() + " topics found<p>";
        this.ResultHtml += sb.ToString();
    }
}
bool SearchFile(string File, string Search, out string Title, out string Image)
{
    Title = "";
    Image = "topic.gif";
    Search = Search.ToLower();

    // Read the whole file; using closes the reader even if reading fails
    string Content;
    using (StreamReader sr = new StreamReader(File))
    {
        Content = sr.ReadToEnd();
    }

    if (Content.ToLower().IndexOf(Search) > -1)
    {
        Title = ExtractString(Content, "<title>", "</title>", false);
        Image = ExtractString(Content, "\r\n<img src=\"bmp/", "\">", false);
        // Fall back to the default icon if no plausible image name was found
        if (Image == null || Image == "" || Image.Length > 25)
            Image = "topic.gif";
        return true;
    }

    System.Threading.Thread.Sleep(0); // Give up the rest of the time slice
    return false;
}
protected static string ExtractString(string Source, string BeginDelim,
                                      string EndDelim, bool CaseSensitive)
{
    // For a case-insensitive match, search a lower-cased copy but extract
    // the result from the original string
    string Text = CaseSensitive ? Source : Source.ToLower();
    string Begin = CaseSensitive ? BeginDelim : BeginDelim.ToLower();
    string End = CaseSensitive ? EndDelim : EndDelim.ToLower();

    // Without a start delimiter there is nothing to extract
    int At1 = Text.IndexOf(Begin);
    if (At1 < 0)
        return "";

    // The end delimiter has to follow the start delimiter
    int At2 = Text.IndexOf(End, At1 + Begin.Length);
    if (At2 < 0)
        return "";

    return Source.Substring(At1 + Begin.Length, At2 - At1 - Begin.Length);
}
</script>
<HTML>
<HEAD>
<title>
<%= this.PageTitle %>
</title>
<base target="wwhelp_right">
<LINK rel="stylesheet" type="text/css" href="templates/wwhelp.css">
<style>
INPUT { FONT-SIZE: 8pt }
</style>
</HEAD>
<body topmargin="0" leftmargin="0" style="BACKGROUND:white">
<table class="tocbody" width="800">
<tr>
<td class="banner" HEIGHT="25" VALIGN="middle"> <b style="font-size:10pt"><%= this.PageTitle %></b>
<div style="font-size:8pt;margin-top:5pt;margin-bottom:3pt"> <a href="index2.htm" target="wwhelp_left">Table of Contents</a></div>
</td>
</tr>
</table>
<form id="Form1" method="post" runat="server" target="wwhelp_left">
<div style="MARGIN-LEFT:5px;width:800px;" class="tocbody">
Search for:
<asp:TextBox id="txtSearch" runat="server" Width="136px"></asp:TextBox>
<asp:Button id="btnSearch" runat="server" Text="Search" OnClick="btnSearch_Click"></asp:Button><BR>
<hr>
<%= this.ResultHtml %>
</div>
</form>
</body>
</HTML>
What’s nice about this approach is that it’s fully self-contained. You can simply drop this into a directory (assuming ASP.NET is installed) and it works. No database, no special index – nothing. Just one simple ASPX page. Because it’s a single page, it’s also easy to modify the logic or search parameters if necessary, and users can customize the display if they choose.
There are a few open issues with this implementation. The searching is overly simplistic at the moment: it looks for an exact match of the whole search string. It would be nice to support at least basic Boolean operators and quoted phrases.
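As a rough idea of what a slightly better parser could look like, here is a minimal sketch that splits the search string into individual terms, keeps text inside double quotes together as one phrase, and requires all terms to be present. The SearchTerms class and its Parse and MatchesAll methods are names I made up for illustration; none of this is in Search.aspx today.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the current Search.aspx
public static class SearchTerms
{
    // Splits a query into lower-cased terms; text inside double quotes
    // stays together as a single phrase.
    public static List<string> Parse(string query)
    {
        List<string> terms = new List<string>();
        // Either a quoted phrase ("...") or a run of non-whitespace characters
        foreach (Match m in Regex.Matches(query, "\"([^\"]+)\"|(\\S+)"))
        {
            string term = m.Groups[1].Success ? m.Groups[1].Value : m.Groups[2].Value;
            // Strip stray quotes (for example an unmatched opening quote)
            term = term.Trim('"', ' ').ToLower();
            if (term.Length > 0)
                terms.Add(term);
        }
        return terms;
    }

    // True only if every term occurs somewhere in the content (implicit AND).
    public static bool MatchesAll(string content, List<string> terms)
    {
        if (terms.Count == 0)
            return false;
        string lower = content.ToLower();
        foreach (string term in terms)
        {
            if (lower.IndexOf(term) < 0)
                return false;
        }
        return true;
    }
}
```

In SearchFile the single IndexOf check could then become something like SearchTerms.MatchesAll(Content, terms), with the terms parsed once per request rather than once per file.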
The other issue is that this searches everything including the HTML, so if you’re looking for something like a TD tag you’re going to return every topic in the help file <g>. I think I can live with that though. Along the same lines, things that are encoded as entities (e.g. &amp;amp; or &amp;copy;) aren’t going to be found.
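If this ever becomes a real problem, one possible fix is to reduce each page to plain text before matching. The sketch below is only an assumption about how that could be done: HtmlText and ToPlainText are hypothetical names, and the regular expressions are a crude tag stripper rather than a real HTML parser. WebUtility.HtmlDecode lives in System.Net; inside an ASPX page HttpUtility.HtmlDecode does the same job.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the current Search.aspx
public static class HtmlText
{
    // Reduces an HTML page to roughly the text a reader would see.
    public static string ToPlainText(string html)
    {
        // Drop script and style blocks entirely, including their content
        string text = Regex.Replace(html, @"<(script|style)\b[^>]*>.*?</\1\s*>", " ",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);

        // Replace the remaining tags with a space so adjacent words
        // do not run together
        text = Regex.Replace(text, @"<[^>]+>", " ");

        // Turn entities back into the characters they stand for
        text = WebUtility.HtmlDecode(text);

        // Collapse runs of whitespace into single spaces
        return Regex.Replace(text, @"\s+", " ").Trim();
    }
}
```

Searching ToPlainText(Content) instead of Content would keep a search for "td" from hitting every topic, at the cost of some extra work per file.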
Currently there’s no reliable way for Help Builder to determine what type of topic it’s dealing with because the existing templates do not have anything in them that identifies the topic type uniquely. Currently the code searches for a specific image tag format, which works only with the latest templates (4.05+) and not with most existing help files. I’ll add a custom tag to the new templates that will take care of this in the future, but this is a forward-compatible fix only. The topic type is useful because the search results can then show the topic type icon next to the topic title, which makes topics easier to identify.
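To make the custom tag idea a bit more concrete, here is a sketch of how the search page might read it. Everything specific here is an assumption: the tag format (a meta tag named topictype), the TopicType class, and the convention that the icon file is named after the topic type. The real templates may well end up using something different.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the meta tag format is an assumption,
// not something the current templates emit
public static class TopicType
{
    // Returns the icon file name for a topic page, or the default
    // topic.gif if the page carries no topic type tag.
    public static string GetIcon(string html)
    {
        Match m = Regex.Match(html,
            "<meta\\s+name=[\"']topictype[\"']\\s+content=[\"']([A-Za-z0-9_]+)[\"']",
            RegexOptions.IgnoreCase);
        if (!m.Success)
            return "topic.gif";

        // Assumed convention: the icon is named after the topic type
        return m.Groups[1].Value.ToLower() + ".gif";
    }
}
```

Older help files without the tag simply fall back to topic.gif, which matches what the page already does when the image lookup fails.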