Cameron Fletcher

Random thoughts and discussions on the things that interest me

Sentence and Word Analysis #2

This is part two of my post on sentence and word analysis. In part one I discussed my motives for analysing the RSS feed in question. In this post I build on my initial findings and present the C# and SQL code that I used to produce them.

I have continued to run the RSS reader periodically and now have 284 job descriptions to analyse. I went through the initial results, identified the words and sentences that are irrelevant, and placed these into a keywords table so that I can strip them from my results. This was quite a lengthy process, as there were nearly a thousand of them to exclude. Looking through the filtered results, it was evident that, because the keywords I was looking for appear in many different permutations, I would need to look within the top 100 words/phrases to identify the ones I was interested in. I decided to leave in keywords that relate to job skills in addition to computer languages.
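To make the counting-and-stripping idea concrete, here is a minimal C# sketch of it. It is an illustration only, not the code from the download: the real analysis is done by the 'analyse' stored procedure, and the class name `PhraseCounter`, the method `TopPhrases` and its parameters are all hypothetical. It assumes that a "phrase" is any run of one to `maxWords` consecutive words and that the exclusion list is held in memory rather than in the keywords table.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of RssReader.zip.
public static class PhraseCounter
{
    // Counts every run of 1..maxWords consecutive words in each description,
    // skips phrases found in the exclusion set, and returns the most frequent.
    public static List<KeyValuePair<string, int>> TopPhrases(
        IEnumerable<string> descriptions,
        HashSet<string> excluded,
        int maxWords,
        int top)
    {
        var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);

        foreach (string description in descriptions)
        {
            // Split on whitespace only, so tokens such as "C#" and ".NET" survive.
            string[] words = description.Split(
                new[] { ' ', '\t', '\r', '\n' },
                StringSplitOptions.RemoveEmptyEntries);

            for (int start = 0; start < words.Length; start++)
            {
                for (int length = 1;
                     length <= maxWords && start + length <= words.Length;
                     length++)
                {
                    string phrase = string.Join(" ", words, start, length);
                    if (excluded.Contains(phrase))
                    {
                        continue; // phrase is on the exclusion list
                    }

                    int current;
                    counts.TryGetValue(phrase, out current); // 0 when not yet seen
                    counts[phrase] = current + 1;
                }
            }
        }

        // Highest count first; ties broken alphabetically for a stable order.
        return counts
            .OrderByDescending(pair => pair.Value)
            .ThenBy(pair => pair.Key, StringComparer.OrdinalIgnoreCase)
            .Take(top)
            .ToList();
    }
}
```

Calling `PhraseCounter.TopPhrases(descriptions, excluded, 3, 100)` would give a top-100 list of one- to three-word phrases. Because the sketch splits on whitespace alone, trailing punctuation stays attached to a word, which may be why variants such as "C#." show up as separate entries in results of this kind.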

The table below shows the top 100 keywords/skills from the analysis of 284 job descriptions. The analysis took 9.5 minutes to run.

  #  Word            Rank     #  Word                Rank     #  Word                Rank
  1  C#               301    35  CSS                   29    69  structured            16
  2  SQL              206    36  E-commerce            29    70  Unix                  16
  3  .NET             203    37  ASP                   26    71  Website               16
  4  Server           173    38  C# .NET               26    72  will work             16
  5  ASP.NET          129    39  CRM                   26    73  automated             15
  6  SQL Server       122    40  Equities              26    74  Datawarehouse         15
  7  SharePoint        79    41  RAD                   26    75  Derivative            15
  8  Office            78    42  SQL Server 2005       26    76  desk                  15
  9  Test              70    43  VBA                   25    77  Equity                15
 10  C++               69    44  Winforms              25    78  International         15
 11  banking           63    45  C#.                   23    79  MOSS                  15
 12  Java              59    46  C#.NET                23    80  OLAP                  15
 13  London            59    47  Fixed Income          23    81  VB6                   15
 14  Front Office      55    48  framework             23    82  ASAP                  14
 15  XML               52    49  Quant                 22    83  Back End              14
 16  Windows           47    50  2                     21    84  Basic                 14
 17  Oracle            45    51  Visual Studio         21    85  business req.         14
 18  tools             45    52  GUI                   20    86  comm. skills          14
 19  database          44    53  VB.NET                20    87  document              14
 20  Excel             43    54  Web based             20    88  experienced C#        14
 21  FX                43    55  Access                19    89  functional            14
 22  MS                41    56  Cash                  18    90  VB                    14
 23  HTML              40    57  digital               18    91  .NET Framework        13
 24  C# ASP.NET        39    58  Finance               18    92  .NET 3.5              13
 25  life cycle        38    59  AJAX                  17    93  ASP.NET C#            13
 26  C# Developer      36    60  Biztalk               17    94  ASP.Net Developer     13
 27  Reporting         36    61  Excel VBA             17    95  C# ASP.net SQL        13
 28  analyst           34    62  media                 17    96  degree                13
 29  JavaScript        33    63  Security              17    97  MS SQL                13
 30  3.5               32    64  ASP.Net SQL           16    98  Rates FX              13
 31  agile             31    65  CMS                   16    99  Reporting Services    13
 32  architecture      31    66  credit derivatives    16   100  Siebel                13
 33  communication     31    67  Silverlight           16
 34  .NET developer    29    68  Sophis                16

A link to a backup of the database may be found here: jobs.zip (392.41 kb)
You will need to restore this into SQL Server before the .NET code (below) will work.

A link to the .NET code (C#) is here: RssReader.zip (3.48 kb)
You will need to modify the App.Config file to point to your RSS feed and database.

To run the analysis on the sentences in the database you'll need to execute the 'analyse' stored procedure. Once that has finished executing, select from the 'analysis_results' view to see the results.
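If you would rather drive these two steps from C# than from a query window, the following is a rough sketch of how that might look. Only the names 'analyse' and 'analysis_results' come from the database; the class `AnalysisRunner`, its methods and the connection string you pass in are hypothetical. It assumes the stored procedure takes no parameters and that `System.Data.SqlClient` is available (built into the .NET Framework, a separate package on newer runtimes). The command timeout is lifted because a run of several minutes would otherwise hit `SqlCommand`'s default 30 second limit.

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

// Hypothetical helper for illustration; not part of RssReader.zip.
public static class AnalysisRunner
{
    // Joins the column values of one result row with tabs for display.
    public static string FormatRow(object[] values)
    {
        return string.Join("\t", values);
    }

    // Executes the 'analyse' stored procedure, then prints every row
    // of the 'analysis_results' view to the console.
    public static void Run(string connectionString)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();

            using (var analyse = new SqlCommand("analyse", connection))
            {
                analyse.CommandType = CommandType.StoredProcedure;
                analyse.CommandTimeout = 0; // 0 = no limit; the default is 30 seconds
                analyse.ExecuteNonQuery();  // assumes the procedure needs no parameters
            }

            using (var select = new SqlCommand("SELECT * FROM analysis_results", connection))
            using (SqlDataReader reader = select.ExecuteReader())
            {
                // The view's columns are not assumed; each row is read generically.
                var values = new object[reader.FieldCount];
                while (reader.Read())
                {
                    reader.GetValues(values);
                    Console.WriteLine(FormatRow(values));
                }
            }
        }
    }
}
```

A call might look like `AnalysisRunner.Run("Data Source=localhost;Initial Catalog=jobs;Integrated Security=True")`, where the server and database names are placeholders for wherever you restored the backup.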

Posted: Jun 09 2009, 14:21 by flet0496
Filed under: .NET
