StorSimple 5000/7000 Series: New Software Upgrade Patch v2.1.1.478 is now available

Software Patch v2.1.1.478 is now available for the StorSimple 5000/7000 series. Download it from: ftp://ftp.xyratex.com/pub/storsimple/Patch7

Key Points:

FAQs:

When should a customer perform a firmware upgrade?

Any appliance running software >= v2.1.1.325 and <= v2.1.1.454 should have the firmware upgrade applied.

Does an appliance running software >= v2.1.1.455 need the firmware upgrade?

No. Any appliance running software >= v2.1.1.455 should already be on the latest firmware and does not need the upgrade.

Will the appliance run into any issues if the firmware upgrade patch storsimple-2.1.1.461-B47-firmware-bundle-install-patch.tar is run on an appliance that already has the latest firmware?

No, there are no issues if the firmware upgrade patch is run on an appliance that already has the latest firmware. The upgrade will simply complete faster, since it compares versions and skips any component that is already at the latest firmware.

 

Big Savings on Microsoft Competency Exams

Build your product knowledge and capabilities by earning the certifications and accreditations that help you and your company stand out.

By purchasing a Microsoft competency exam pack, you take a step toward meeting your organisation’s Microsoft gold or silver competency requirements.

Provide exam vouchers to individuals within your organisation to prepare them for their Microsoft Certification exams and to achieve your organisation’s competency requirements.

Whether you’re looking to meet new competency requirements or to simply highlight your team’s world-class skills, this is a great way to get started.

Don’t delay, the promotion ends December 31, 2014, and the vouchers expire May 31, 2015.

 

image

image

How to order competency exam packs:

  1. Determine which pack you want: one, five or ten (remember that these vouchers are only good until May 31, 2015).
  2. Go to the Pearson VUE voucher site and click one of the PDF links to open the online voucher order form for your region.
  3. Fill in the order form. It is helpful to know:
    • Your Microsoft partner number
    • The currency you will be using to pay
    • Method of payment (credit card, PO, etc.)
  4. Click the “Submit” button to send your voucher order directly to Pearson VUE.
  5. Vouchers will be issued within two business days from the time of submission.

Learn to use the PowerShell cmdlets for testing the Microsoft Distributed Transaction Coordinator (MSDTC)

This article assumes the following:

        1. You have no firewall between the server and client, or the firewall has been disabled or configured to allow DTC traffic.

        2. The authentication scheme for DTC is supported by both server and client.

        3. The tester is running as system administrator with elevated permissions.

        4. You have permissions to execute PowerShell commands (Set-ExecutionPolicy).

 

* You may use the Component Services MMC snap-in to view the Transaction Statistics during this exercise. Load Component Services from Administrative Tools and navigate to:

        [MMC\Component Services\Computers\My Computer\Distributed Transaction Coordinator\Local DTC\Transaction Statistics]

How to perform a local transaction

                1. Open C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe as Administrator

                2. Run the following command:

Start-DtcDiagnosticResourceManager

                3. Now create a variable to hold a DtcDiagnosticTransaction by running the following command: 

                $transObject = New-DtcDiagnosticTransaction

                4. To send the transaction, run the following command:

                Send-DtcDiagnosticTransaction -Transaction $transObject

                (Notice the Transaction Statistics now show an active transaction)                 

                5. Now complete the transaction by issuing this command:

Complete-DtcDiagnosticTransaction $transObject

                (Notice the transaction should be committed)                 

                6. If you are finished testing, you can stop the Resource Manager by issuing this command:

Stop-DtcDiagnosticResourceManager -Name "DtcDiagRM:3002"

               

How to perform a distributed transaction

For our example, I’ve named the two machines Server1x64 and Client1x64 to act as participants in a Distributed Transaction.    

        Perform these steps on Server1x64

1. Open C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe as Administrator
2. Run the following command:

Start-DtcDiagnosticResourceManager    

        Perform these steps on Client1x64

1. Create a variable to hold a DtcDiagnosticTransaction by running the following command: 

$transObject = New-DtcDiagnosticTransaction

               
2. To send the transaction, run the following command:

Send-DtcDiagnosticTransaction -Transaction $transObject -ComputerName Server1x64

               
3. Now complete the transaction by issuing this command:

Complete-DtcDiagnosticTransaction $transObject 

                       

Cleanup: Perform these steps on Server1x64

If you are finished testing, you can stop the Resource Manager by issuing this command:

Stop-DtcDiagnosticResourceManager -Name "DtcDiagRM:3002"

References:

 

Using the Set-ExecutionPolicy Cmdlet

http://technet.microsoft.com/en-us/library/ee176961.aspx

 

Configuring Microsoft Distributed Transaction Coordinator (DTC) to work through a firewall

http://support.microsoft.com/kb/250367

 

Microsoft.Dtc.PowerShell.Diagnostics

http://msdn.microsoft.com/en-us/library/windows/desktop/hh438328(v=VS.85).aspx

       

       

                       

CODESS Events in Mountain View, California and London, UK

I’ll be presenting and helping at this event for Women in Tech on Oct 11th, 2014. Feel free to register if you’re in Northern California. The event takes place at Microsoft in Mountain View.

http://codesssvhackathon.challengepost.com/

We have a similar event taking place on the same day in London, in case you’re based near there.

http://codesshackathonlondon.challengepost.com/

 

Simple REST api for Executing PowerShell Scripts

There may be scenarios where you don’t need the full interactive experience of PowerShell remoting over PSRP (PowerShell Remoting Protocol), or where you need to execute PowerShell scripts from a non-Windows system. You could code directly against WS-Man, as I blogged about a long time ago; however, that’s quite complicated and requires knowledge of SOAP and WS-Man. In cases where you simply want to invoke a PowerShell script remotely, a REST API is a good choice, since all modern programming languages make it simple to perform an HTTP GET operation.

I wrote a simple HTTP Listener in PowerShell script that uses the .Net HttpListener class.  You simply start an instance of the listener (requires an elevated prompt to listen on the network) and now you can execute an arbitrary PowerShell command-line and get back the results in a variety of formats.

Once you have the module installed (just copy the HttpListener folder to your Documents\WindowsPowerShell\Modules folder) and imported, you can start an instance by using the Start-HTTPListener cmdlet:

 image

By default, I have it set to listen on port 8888 using Integrated Windows Authentication, but this is all configurable via parameters to the cmdlet. Here, I have another PowerShell window open and I execute “get-process powershell”:

image

Note that I used –UseDefaultCredentials to ensure I authenticate with my current security context; otherwise you’ll get back Access Denied, since anonymous access is not supported.

By default, I return the output as JSON. Because the resulting JSON object is large, it’s truncated. The body of the HTTP response is in the Content property, but it’s a byte array. I use the .NET UTF8 encoding class to decode it and show a portion of the JSON:

image

I can use ConvertFrom-Json to make it more presentable within PowerShell:

image

Of course, if you’re using a different scripting client such as Python, Perl, or Ruby, you would use those languages’ libraries to consume the JSON output.
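For example, the decode-and-parse step can be sketched in Python. This is only an illustration: the URL and query parameter shown in the comment are assumptions based on the description above, not the module's documented interface, and the authenticated request itself is elided.

```python
import json

def decode_body(body: bytes):
    """UTF-8 decode an HTTP response body, then parse it as JSON.

    Mirrors the step described above: the body comes back as bytes,
    and in PowerShell the author decodes with the .NET UTF8 encoding
    and then uses ConvertFrom-Json.
    """
    return json.loads(body.decode("utf-8"))

# A real client would first fetch the body over HTTP, something like
# (hypothetical URL/parameter -- the listener defaults to port 8888,
# and anonymous access is not supported, so credentials are required):
#
#   import requests
#   body = requests.get("http://server:8888/?command=get-process").content
#
# A canned payload keeps this sketch self-contained:
body = b'[{"Name": "powershell", "Id": 4242}]'
processes = decode_body(body)
print(processes[0]["Name"])  # -> powershell
```

The same two-step decode (bytes to text, text to objects) applies in Perl or Ruby with their JSON libraries.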

If you’re just playing around, you can use the Text formatting to make things more easily readable; it resembles what you would get executing the same command line directly in PowerShell:

image

Other formatting options are available and if you call the HTTP endpoint without any parameters, you’ll get the basic usage information back:

image

The code is available on TechNet where I also explain a bit more on how the code actually works.

Steve Lee
Principal Test Lead PowerShell
Windows Server

Peer Coaching – Supporting teachers in outback Western Australia

What is Peer Coaching?

Peer Coaching is a professional learning program that supports integration of 21st Century Skills into programming and learning activities. Through a ‘coaching cycle’ a coach works with a teacher to assess, plan, prepare, implement and reflect on a change in classroom practice that ultimately embeds 21st Century Learning opportunities in activities. Peer Coaches have skills that help them to work with teachers in a collegial manner that supports teachers making decisions and choosing outcomes that best suit their classroom. It is a program that nurtures sustainable change and ICT integration in schools over time.

Developed by Dr Les Foltos, Peer Coaching has been the focus of professional learning in many schools around the globe. In Australia, schools from K through to 12 have found success and developed a positive, school-outcome-focussed professional learning community through peer coaching.

Approximately 60 participants from the Perth metropolitan area and Pilbara region in Western Australia’s North West have recently completed the fourth and final day of the Microsoft accredited ICT Peer Coaching course. The original Microsoft course was modified to include the Western Australian curriculum and the unique online environment Connect to build a relevant model of professional learning focusing on collaboration within schools and across networks of schools.  Designed to help teachers embed the use of 21st Century ICT into daily classroom use for students, the course facilitated analysis, questioning, action and reflection of the ways Connect could be used by students, teachers and parents.

The course was delivered in a blended format by the Department of Education’s E-Schooling officers Jan Loftus and Jeni McAuliffe after working with Microsoft consultant Pip Cleaves.  An ICT Peer Coaching Community was set up in Connect for coaches to seek and provide support, ideas and resources on their journey to becoming an effective coach. The success of this Connect community is now apparent with regular reflective discussions and sharing of resources. This opportunity for engagement in coaching conversations within or across schools has contributed to a shift from teachers being ICT experts to realising the significant value of using careful questioning, active listening and skilful reflection strategies.

Pilbara school coaches built the confidence to support teachers in using features in Connect such as sharing student learning experiences with parents online, using digital resources with students, and uploading learning activities into Learning Content in Connect Classes.

Emma Baker, from Baynton West Primary School in Karratha, posted in the ICT Peer Coaching Community, “My Teacher Development Coach role has enabled me to work with many teachers across the Pilbara to build capacity in ICT. I have worked with coachees to embed Connect into planning, teaching and assessment, and develop video conferencing skills”.

Participants generally coached two or three other teachers in their school and the stage is now set for this professional learning model to continue to grow.

From here, some of the coaches will deliver the Peer Coaching modules to other teachers to receive a further level of accreditation as ICT Peer Coach Facilitators.

If you are interested in finding out more about Microsoft Peer Coaching in Australia, contact Design Learn Empower or Expanding Learning Horizons.


 

XAML Tricks

 

By no means am I an expert in XAML, but I learned a few tricks while developing Windows Store and Phone apps.

In this blog post, I am going to talk about:

  • Design Time Binding
  • Constant width for all items in ListView
  • Modal dialog box
  • Alternative to Semantic Zoom control
  • Preselect items in GridView

Complete source code is here

Design time binding

One of the huge benefits of Visual Studio is that it displays a visual representation of how the XAML page is going to look at run time. But it shows only hard-coded text; dynamic data is loaded only at run time and is not available at design time. The most common way to view/edit XAML for dynamic data is run-view-edit: hit F5, view the page, stop debugging, edit the XAML and hit F5 again. This process is time consuming. Instead, we can use design-time binding to static data. Here are the simple steps for design-time binding:

  1. Create a class that derives from List<T> of your model
  2. Create an empty XAML file and add data for the above class and your model
  3. Add a DataContext to your UI element and bind it to the above XAML file

Here are my model classes:

public class DesignTimeModels : List<MyModel>
{
    public DesignTimeModels() { }
}

public class MyModel
{
    public string Title { get; set; }
}

Here is the XAML file with sample dynamic data

<m:DesignTimeModels xmlns:m="clr-namespace:UniversalAppsXAML">
    <m:MyModel xmlns:m="clr-namespace:UniversalAppsXAML" Title="ONE" />
    <m:MyModel xmlns:m="clr-namespace:UniversalAppsXAML" Title="TWO" />
</m:DesignTimeModels>

Here is the XAML page with UI elements

<GridView Grid.Row="1"
          Margin="0,15"
          x:Name="itemGridView1"
          AutomationProperties.AutomationId="ItemGridView"
          AutomationProperties.Name="Items In Group"
          ItemTemplate="{StaticResource ItemTemplateOne}"
          d:DataContext="{d:DesignData Source=..//UniversalAppsXAML.Shared//DesignTimeData.xaml}"
          ItemsSource="{Binding }"
          SelectionMode="None"
          IsSwipeEnabled="false" />

Here is the output

XAML_DesignTimeBinding

 

Constant width for all items in ListView in Windows Phone

When using a ListView with varying item sizes, the item width keeps changing and the UI doesn’t look good. We cannot set a fixed width because we don’t know the phone screen size. In the screenshot below, compare ListView One (red) with ListView Two (blue):

XAML_Same_size

ListView One is using this standard template:

<DataTemplate x:Key="ItemTemplateOne">
    <StackPanel Margin="15" Background="Red">
        <TextBlock Margin="5"
                   Text="{Binding Title}"
                   Style="{StaticResource TitleTextBlockStyle}"
                   TextWrapping="WrapWholeWords" />
    </StackPanel>
</DataTemplate>

ListView Two is using this modified template; the extra TextBlock, drawn in the background color behind the content (Canvas.ZIndex="-1"), forces every item to the same full width:

<DataTemplate x:Key="ItemTemplateTwo">
    <StackPanel Margin="15" Background="Blue">
        <TextBlock Margin="5"
                   Text="{Binding Title}"
                   Style="{StaticResource TitleTextBlockStyle}"
                   TextWrapping="WrapWholeWords" />
        <TextBlock Margin="0,-20,0,0"
                   Text="ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ"
                   Foreground="Blue"
                   Canvas.ZIndex="-1" />
    </StackPanel>
</DataTemplate>

Modal Dialog Box

When we are processing long-running requests, we want to inform the user about the progress, but we don’t want them to interact with the UI until the processing is done. Common scenarios are logging in, refreshing content, talking to a remote server, etc. In the Windows Phone world, we can use ContentDialog as shown below:

<ContentDialog x:Name="ProgressDialog" Title="{Binding Title}" Content="{Binding ProgressMessage}" Background="#BC403B" />

The code behind looks like this. Note: we do not await ShowAsync(), because it only completes when the dialog is closed; instead we await DoWork() and then call Hide():

private async void Button_Click(object sender, RoutedEventArgs e)
{
    ProgressViewModel.model.Init("Modal Dialog");
    ProgressDialog.DataContext = ProgressViewModel.model;
    ProgressDialog.ShowAsync();
    await DoWork();
    ProgressDialog.Hide();
}

private async Task<int> DoWork()
{
    for (int i = 0; i < 10; i++)
    {
        ProgressViewModel.model.ProgressMessage = i.ToString();
        await Task.Delay(1000);
    }
    return 1;
}

Output looks like this :

XAML_Modal_dlg

 

For a Windows Store app, we have to use Callisto from NuGet (ContentDialog is available only on Windows Phone). Here are the steps:

  1. Right click on the Windows Store App project and select Manage NuGet Packages
  2. In the search textbox, enter Callisto and install it
  3. In the XAML, add this below code
    <callisto:CustomDialog x:FieldModifier="public"
                           x:Name="ProgressDialog"
                           Title="{Binding Title}"
                           BackButtonVisibility="{Binding ShowBackButton}"
                           Background="#BC403B">
        <StackPanel Orientation="Vertical">
            <TextBlock Margin="5"
                       Text="{Binding ProgressMessage}"
                       Style="{StaticResource SubheaderTextBlockStyle}" />
        </StackPanel>
    </callisto:CustomDialog>

  4. In the code behind, add this code
    private async void AppBarButton_Click(object sender, RoutedEventArgs e)
    {
        ProgressViewModel.model.Init("Modal Dialog");
        ProgressDialog.DataContext = ProgressViewModel.model;
        ProgressDialog.IsOpen = true;
        int i = await DoWork();
        if (i == 0)
        {
            ProgressDialog.IsOpen = false;
        }
        else
        {
            // Show error
            ProgressDialog.BackButtonVisibility = Windows.UI.Xaml.Visibility.Visible;
        }
    }

    private async Task<int> DoWork()
    {
        for (int i = 0; i < 10; i++)
        {
            ProgressViewModel.model.ProgressMessage = i.ToString();
            await Task.Delay(1000);
        }
        return 1;
    }

The output looks like this

screenshot_09282014_161906

Alternative to Semantic Zoom control

The best sample code for the Semantic Zoom control on Windows Phone is the GolfLink app. For a Windows Store app sample, click here.

Another option is adding a Flyout to the GridView header, as shown below:

screenshot_09292014_161058

Here is the XAML code

<GridView.GroupStyle>
    <GroupStyle>
        <GroupStyle.HeaderTemplate>
            <DataTemplate>
                <Grid Margin="0,0,0,2">
                    <Button>
                        <StackPanel Orientation="Horizontal">
                            <TextBlock Text="{Binding Title}" Tapped="TextBlock_Tapped">
                                <FlyoutBase.AttachedFlyout>
                                    <Flyout>
                                        <ListBox SelectionChanged="ListBox_SelectionChanged"
                                                 ItemsSource="{Binding Source={StaticResource itemsViewSource}}">
                                            <ListBox.ItemTemplate>
                                                <DataTemplate>
                                                    <TextBlock Text="{Binding Title}" />
                                                </DataTemplate>
                                            </ListBox.ItemTemplate>
                                        </ListBox>
                                    </Flyout>
                                </FlyoutBase.AttachedFlyout>
                            </TextBlock>
                        </StackPanel>
                    </Button>
                </Grid>
            </DataTemplate>
        </GroupStyle.HeaderTemplate>
    </GroupStyle>
</GridView.GroupStyle>

Preselect items

To preselect GridView items on page load, you can use SelectedItems.Add() as shown below:

private void navigationHelper_LoadState(object sender, LoadStateEventArgs e)
{
    model.Load();
    DataContext = model;
    Random r = new Random((int)DateTime.Now.Ticks);
    foreach (var item in model.MyData)
    {
        if (r.Next(2) > 0)
            itemGridView.SelectedItems.Add(item);
    }
}

The output looks like this

screenshot_09292014_162712

 

Complete source code is here

Creating SSL Certificates for CRM 2013 Test Environment

A while back I posted a blog on creating SSL certificates in Server 2008 for a CRM 2011 environment. I had requests on how to do this for CRM 2013 in a Windows Server 2012 environment. When working on a CRM test environment, there are many scenarios where I need to add SSL to the CRM web site, such as testing Claims Authentication. Instead of getting a certificate from a 3rd-party certification authority, I will just use IIS to generate my own certificates. This allows me to quickly create certificates…(read more)

Performance Quiz #14: Memory Locality etc. Bonus Round!

[Note: I accidentally pasted an x64 result in place of an x86 result. As it happens, the point I was trying to make was that they were very similar, which they are, but they aren't identical... corrected.]

Thanks to some careful readers, I discovered why the shuffling factors discussed in my previous blog entry were so strange. The rand() function I was using only returns numbers between 0 and RAND_MAX, which is 32767 on my system. That meant that shuffling became a non-factor as the number of elements in the array increased.
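A quick back-of-the-envelope sketch shows how limiting that is (Python, just for illustration; it assumes swap indices were effectively capped at RAND_MAX, as described above, so only the first 32768 slots could ever be touched):

```python
RAND_MAX = 32767  # typical value where rand() tops out, as noted above

def reachable_fraction(n):
    """Fraction of an n-element array that rand()-derived indices can reach."""
    return min(RAND_MAX + 1, n) / n

for n in (32000, 1024000, 65536000):
    print(n, reachable_fraction(n))
# 32000 is fully reachable; at 65536000 elements only 0.05% of the
# array can ever be swapped, so "shuffled" data stays almost in order.
```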

I’ve since switched the code to use this instead:

#include <random> // at the top of the file

    void Shuffle(int m)
    {
        std::mt19937 gen(0);  // repeatable results desired
        std::uniform_int_distribution<> dist(0, _n - 1);

        for (int nn = 0; nn < m; nn++)
        {
            int i = dist(gen);
            int j = dist(gen);

            if (i == j) continue;

            int tmp = _perm[i];
            _perm[i] = _perm[j];
            _perm[j] = tmp;            
        }
    }

With this change, shuffling becomes generally effective again and the result is that non-locality dominates the effects, which is a lot more like what we would expect. I thought that the effect might top out at very large array sizes, but it does not. The normal prefetching of in-order memory continues to give us benefits even at very large array sizes. Let’s look at the architectural differences once again.

Pointer implementation with no changes
sizeof(int*)=4, sizeof(T)=12
  shuffle,   0.00,   0.01,   0.10,   0.25,   0.50,   1.00
     1000,   1.99,   1.99,   1.99,   1.99,   1.99,   1.99
     2000,   1.99,   1.99,   1.99,   1.99,   1.99,   1.99
     4000,   1.99,   1.99,   2.20,   2.49,   2.92,   3.20
     8000,   1.85,   1.96,   2.38,   3.13,   3.88,   4.34
    16000,   1.88,   1.96,   2.56,   3.40,   4.59,   4.94
    32000,   1.88,   2.17,   3.52,   6.20,   8.16,  11.24
    64000,   1.88,   2.33,   4.45,   8.13,  11.87,  15.96
   128000,   2.06,   2.27,   5.15,   9.32,  14.81,  18.29
   256000,   1.90,   2.35,   5.90,  10.99,  16.26,  21.39
   512000,   1.92,   2.77,   6.42,  11.59,  17.55,  23.71
  1024000,   2.22,   3.42,   7.96,  14.58,  22.97,  30.00
  2048000,   2.54,   4.68,  17.47,  38.54,  62.76,  90.17
  4096000,   2.50,   5.70,  25.17,  53.21,  89.79, 121.06
  8192000,   2.50,   6.31,  30.36,  63.07, 106.04, 142.09
 16384000,   2.50,   6.68,  32.94,  68.79, 115.24, 156.78
 32768000,   2.55,   7.22,  34.64,  73.71, 123.98, 165.47
 65536000,   2.55,   7.96,  36.49,  73.39, 126.17, 168.88

OK, here we see just how profound the non-locality effects are. The cost per item goes up to a whopping 168.88 at the bottom of the table for fully shuffled data. And you can see that even a small amount (1%) of non-locality makes a crushing difference. You can also clearly see that while the data fits in cache, none of this matters a whit. But even at 4k elements we’re already starting to see the effects.

Now let’s compare the same program run on x64.

Pointer implementation with no changes
sizeof(int*)=8, sizeof(T)=20
  shuffle,   0.00,   0.01,   0.10,   0.25,   0.50,   1.00
     1000,   2.28,   1.99,   1.99,   1.99,   2.28,   1.99
     2000,   2.28,   2.42,   2.56,   2.84,   3.27,   3.27
     4000,   2.28,   5.12,   2.92,   3.84,   4.62,   5.05
     8000,   2.17,   2.28,   3.13,   4.34,   5.62,   5.90
    16000,   2.24,   2.49,   3.95,   7.02,   8.28,  10.97
    32000,   2.25,   2.54,   5.00,   8.98,  13.33,  16.75
    64000,   2.24,   2.91,   6.01,  10.22,  15.69,  19.93
   128000,   2.40,   2.93,   7.06,  11.51,  18.80,  22.25
   256000,   2.45,   2.96,   7.16,  13.97,  19.75,  24.89
   512000,   3.06,   3.49,   8.25,  19.20,  25.23,  28.36
  1024000,   3.59,   5.39,  16.21,  34.18,  60.90,  83.61
  2048000,   3.79,   7.18,  27.84,  58.19,  97.72, 130.46
  4096000,   3.79,   8.18,  34.17,  70.37, 118.09, 153.55
  8192000,   3.78,   8.72,  37.35,  76.35, 128.28, 166.00
 16384000,   3.76,   9.27,  39.17,  81.54, 137.48, 173.84
 32768000,   3.77,  10.06,  40.25,  83.91, 140.59, 178.86
 65536000,   3.74,  10.94,  43.60,  86.91, 142.89, 183.30

OK, you can see that the increase in the size of the data structure makes the unshuffled data significantly worse, just like before. And the size difference actually dilutes the non-locality somewhat: it’s merely 49 times worse on x64 rather than 66 times worse like on x86. But it’s still hella bad. Note that a 66% data growth in this case is resulting in a 46% performance hit. So actually x64 is doing worse overall not just because of raw data growth; the larger pointers are causing other inefficiencies.
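As a sanity check, the ratios quoted here can be recomputed from the bottom rows of the two tables (a throwaway Python calculation; the values are copied from the 65536000-element rows above):

```python
# Bottom-row values (65536000 elements) from the x86 and x64 tables:
x86_ordered, x86_shuffled = 2.55, 168.88
x64_ordered, x64_shuffled = 3.74, 183.30

print(round(x86_shuffled / x86_ordered))             # ~66x worse on x86
print(round(x64_shuffled / x64_ordered))             # ~49x worse on x64
print(round((20 / 12 - 1) * 100))                    # ~66% data growth (12 -> 20 bytes)
print(round((x64_ordered / x86_ordered - 1) * 100))  # ~46-47% ordered-cost hit
```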

Now I was going to skip the unaligned stuff thinking it would all be the same but I was wrong about that, these data end up being interesting too.  In the interest of saving space I’m just providing the first few rows of the table and the last few for these 4 experiments.

Pointer implementation with bogus byte in it to force unalignment
sizeof(int*)=4, sizeof(T)=13
  shuffle,   0.00,   0.01,   0.10,   0.25,   0.50,   1.00
     1000,   2.28,   1.99,   2.28,   2.28,   1.99,   2.28
     2000,   1.99,   2.13,   2.13,   2.13,   2.13,   2.13
     4000,   2.06,   2.13,   2.70,   3.06,   3.63,   4.05
 16384000,   2.75,   7.39,  35.18,  74.71, 124.48, 167.20
 32768000,   2.74,   7.83,  37.35,  77.54, 126.54, 170.70
 65536000,   2.69,   8.31,  37.16,  77.20, 129.02, 175.87

OK, naturally x86 is somewhat worse with unaligned pointers, but not that much worse. Mostly we can attribute this to the fact that the data structure is now one byte bigger. The 2.69 in the leading column is likely a fluke, so I’m going to go with 2.74 as the final value for unshuffled data. Note that the execution time degraded by about 7.4% and the data grew by about 8.3%, so really this is mostly about data growth and nothing else. The shuffled data grew 4%, so the caching effects seem to dilute the size growth somewhat.

Pointer implementation with bogus byte in it to force unalignment
sizeof(int*)=8, sizeof(T)=21
  shuffle,   0.00,   0.01,   0.10,   0.25,   0.50,   1.00
     1000,   2.28,   2.28,   2.28,   2.28,   2.28,   2.28
     2000,   2.42,   2.42,   2.84,   3.27,   3.70,   3.84
     4000,   2.42,   2.49,   3.34,   4.48,   5.48,   5.97
 16384000,   3.81,   9.58,  40.01,  83.92, 142.32, 181.21
 32768000,   3.96,  10.34,  41.85,  88.28, 144.39, 182.84
 65536000,   3.96,  11.37,  45.78,  92.34, 151.77, 192.66

Here’s where things get interesting though.  On x64 the cost of unshuffled data grew to 3.96ns per item from 3.74.  That’s a 5.8% growth.  And the shuffled data grew to 192 from 183, also about 5%.  That’s pretty interesting because the data also grew 5%, exactly 5% as it turns out, from 20 to 21 bytes.

What happens if we grow it a little more to get perfect alignment?

Pointer implementation with extra padding to ensure alignment
sizeof(int*)=4, sizeof(T)=16
  shuffle,   0.00,   0.01,   0.10,   0.25,   0.50,   1.00
     1000,   1.99,   1.99,   1.99,   1.99,   1.99,   1.99
     2000,   1.99,   1.99,   2.13,   2.13,   2.28,   2.13
     4000,   2.06,   1.99,   2.56,   3.34,   3.98,   4.05
 16384000,   3.04,   7.77,  35.52,  74.26, 125.54, 163.54
 32768000,   3.06,   8.44,  36.86,  77.08, 129.97, 168.43
 65536000,   3.08,   9.26,  38.16,  78.69, 129.42, 171.52

Well, on x86, growing it doesn’t help at all. The size hit brings us to 3.08 and 171.52 on the bottom row. The first number is worse, increasing from 2.74, so padding certainly didn’t help there. The second number is somewhat better, ~171 vs. ~175; perhaps alignment is helping us to avoid cache splits, but it’s only a 2% win and we paid a lot of space to get it (16 bytes instead of 13, or 23% growth). What about on x64?

Here the situation is interesting.   

Pointer implementation with extra padding to ensure alignment
sizeof(int*)=8, sizeof(T)=24
  shuffle,   0.00,   0.01,   0.10,   0.25,   0.50,   1.00
     1000,   1.99,   1.99,   1.99,   2.28,   1.99,   1.99
     2000,   2.13,   2.28,   2.70,   2.99,   3.56,   3.70
     4000,   2.28,   2.28,   2.99,   3.91,   4.48,   4.98
 16384000,   4.30,   9.79,  38.55,  80.31, 133.95, 168.14
 32768000,   4.29,  10.81,  39.99,  83.10, 137.47, 168.60
 65536000,   4.29,  11.45,  44.03,  88.23, 143.56, 176.47

Look at that… things did get worse on unshuffled data due to the size growth; we’re up to 4.29. That’s pretty much the result I expected: data size is king, especially for ordered data. However, the last column is surprising. With shuffled data we’re now at 176.47. Even though we grew the structure to 24 bytes, it’s actually running quite a lot faster than the 192.66 we got in the worst case. Again I blame the fact that the random access is subject to cache splits, and those are not present if the data is aligned. In fact, even a small amount of randomness is enough to make the aligned data structures better. At 10% shuffling we were already ahead despite the fact that the data is bigger. So bottom line here: the unaligned pointer wasn’t costing us much, but creating cache splits definitely does, and we see those in the shuffled data. [Note: this is a bit of a stretch because I didn’t actually gather stats on cache splits. I know that the odd-size packed data must have them and they seem sufficient to explain the behavior, whilst the in-order data will be largely immune, but that isn’t full confirmation, so I could be wrong.]

Finally, looking at using indices instead of pointers.

Standard index based implementation
sizeof(int*)=4, sizeof(T)=12
  shuffle,   0.00,   0.01,   0.10,   0.25,   0.50,   1.00
     1000,   3.70,   3.41,   3.41,   3.41,   3.70,   3.41
     2000,   3.41,   3.56,   3.41,   3.41,   3.41,   3.41
     4000,   3.63,   3.41,   3.63,   3.98,   4.34,   4.62
     8000,   3.45,   3.56,   3.84,   4.62,   5.19,   5.69
    16000,   3.52,   3.56,   4.00,   4.85,   5.67,   6.33
    32000,   3.48,   3.64,   5.12,   7.24,   9.83,  12.07
    64000,   3.48,   3.76,   6.10,   9.52,  13.78,  17.54
   128000,   3.48,   3.87,   6.70,  10.74,  15.90,  20.23
   256000,   3.48,   3.95,   7.34,  11.96,  17.48,  22.48
   512000,   3.46,   4.01,   7.75,  13.03,  19.69,  25.45
  1024000,   3.70,   4.89,  10.27,  16.68,  25.41,  32.78
  2048000,   3.80,   6.03,  19.54,  39.37,  65.81,  89.51
  4096000,   3.80,   7.12,  27.58,  55.90,  94.37, 125.29
  8192000,   3.78,   7.72,  32.42,  65.97, 110.21, 146.96
 16384000,   3.79,   8.07,  35.15,  71.50, 119.32, 159.43
 32768000,   3.63,   8.19,  35.50,  72.99, 121.11, 161.69
 65536000,   3.78,   9.18,  37.63,  76.34, 123.47, 164.85

As before, the x86 code is slower… there is nothing but downside there, since the pointers were already 32 bits.

Standard index based implementation
sizeof(int*)=8, sizeof(T)=12
  shuffle,   0.00,   0.01,   0.10,   0.25,   0.50,   1.00
     1000,   3.41,   3.41,   3.41,   3.70,   3.41,   3.70
     2000,   3.41,   3.56,   3.41,   3.41,   3.41,   3.41
     4000,   3.56,   3.41,   3.63,   4.05,   4.34,   4.55
     8000,   3.41,   3.52,   3.88,   4.59,   5.19,   5.65
    16000,   3.43,   3.61,   4.00,   4.84,   5.76,   6.31
    32000,   3.47,   3.64,   5.08,   7.24,   9.78,  12.20
    64000,   3.48,   3.75,   6.11,   9.52,  13.84,  17.54
   128000,   3.49,   3.86,   6.71,  10.72,  15.86,  20.23
   256000,   3.48,   3.96,   7.34,  11.96,  17.83,  22.74
   512000,   3.45,   4.00,   7.75,  12.80,  19.16,  25.49
  1024000,   3.70,   5.27,  16.56,  16.10,  23.88,  32.46
  2048000,   3.80,   6.10,  19.48,  39.09,  66.05,  89.76
  4096000,   3.80,   6.94,  26.73,  54.67,  91.38, 125.91
  8192000,   3.79,   7.72,  32.45,  66.14, 110.40, 146.90
 16384000,   3.62,   7.98,  35.10,  71.72, 120.31, 159.13
 32768000,   3.77,   8.43,  36.51,  75.20, 124.58, 165.14
 65536000,   3.77,   9.15,  37.52,  76.90, 126.96, 168.40

And we see that the x64 run times are very similar. x64 provides no benefits, but we can use the data reduction provided by 32-bit indices to save space and net a significant savings in execution time with even modestly (1%) shuffled data, an entirely normal situation. And the space savings don’t suck either. The raw speed when comparing fully ordered data is about a wash.

Just for grins I hacked up an implementation that uses 16-bit integers. Here’s what it looks like on x64, keeping in mind that you could only use this if your array sizes were known to be less than 64k (which they often are, frankly), or if you could flexibly swap it in.

Standard  short index based implementation
sizeof(int*)=8, sizeof(T)=6
  shuffle,   0.00,   0.01,   0.10,   0.25,   0.50,   1.00
     1000,   3.98,   3.98,   3.98,   3.98,   3.98,   3.98
     2000,   3.84,   3.98,   3.84,   3.98,   3.84,   3.84
     4000,   3.91,   4.05,   3.91,   3.91,   3.91,   3.91
     8000,   3.95,   3.98,   4.12,   4.37,   4.69,   4.94
    16000,   4.00,   3.98,   4.34,   4.84,   5.49,   6.04
    32000,   3.95,   4.06,   4.46,   5.11,   6.11,   6.28
    64000,   3.93,   4.09,   5.55,   7.05,   9.77,  11.45

What’s interesting here is that even though it’s smaller, the short-word fetching on x64 seems inferior, and so at 64k elements the unshuffled cost is higher. But if the data varies, then the space savings kick in and you get a nice win. It’s sort of like moving yourself one notch down in the table at that point.
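To put "one notch down in the table" in concrete terms, here is a rough footprint calculation (Python, using the sizeof(T) values the two runs report: 12 bytes for the standard index version, 6 for the short-index one):

```python
STD_T, SHORT_T = 12, 6  # sizeof(T) reported by the two implementations

def footprint_kib(n, elem_size):
    """Working-set size in KiB for n elements of elem_size bytes."""
    return n * elem_size / 1024

n = 64000  # near the 64k-element ceiling that 16-bit indices impose
print(footprint_kib(n, STD_T))    # 750.0 KiB
print(footprint_kib(n, SHORT_T))  # 375.0 KiB -- half the working set
```

Halving the working set means the short version behaves roughly like the standard version run on an array half the size, which is exactly the "one notch down" effect described above.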

Subsystem for UNIX-based Applications is not affected by Shellshock

Greetings, users of Subsystem for UNIX-based Applications (SUA). I’m the technical writer for SUA, which was deprecated in the Windows Server 2012 and Windows 8 timeframe.
This is just a quick post to let you know (because customers have been concerned about SUA and the Shellshock vulnerability in Bash) that SUA is not affected by Shellshock. SUA and its associated SDK do not include Bash, and installing SUA does not install Bash.
For more information about Shellshock, I’ve collected the following…(read more)