Group delay using frequency translation on E5071B

I am using the frequency translation feature on an E5071B to measure the phase and group delay of a downconverter. When I use it to measure filters, etc., I get good, consistent phase and group delay numbers. But when I use it to measure downconverters, I get inconsistent results: the phase (and group delay) numbers change when I change the frequency span.

For example, I have a downconverter with its RF tuned to 200 MHz and an IF output centered at 70 MHz (inverted spectrum). I set up the 5071B to sweep a span of 50 MHz with 1601 points, and have set the 5071B IFBW as low as 1 kHz. When the span is set to 50 MHz (Port1 = 175-225 MHz, Port = 95-45 MHz, LO = 270 MHz), I get the group delay I expect (about 2.5 us). However, when I zoom the span to 20 MHz, the group delay reads about 3.4 us. At 10 MHz I'm getting 4.5 us. And so on....
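To make the setup above concrete, here is a small Python sketch of the frequency plan. It only does the arithmetic implied by the numbers in this post (IF = LO - RF, 1601 points); the names `sweep_plan`, `LO_HZ`, etc. are made up for illustration and have nothing to do with the analyzer's firmware.

```python
# Frequency plan from the post: RF centered at 200 MHz, LO = 270 MHz,
# inverted-spectrum IF = LO - RF, 1601 sweep points.
LO_HZ = 270e6
RF_CENTER_HZ = 200e6
NUM_POINTS = 1601


def sweep_plan(span_hz):
    """Return (rf_start, rf_stop, if_start, if_stop, step) in Hz for a given span."""
    rf_start = RF_CENTER_HZ - span_hz / 2
    rf_stop = RF_CENTER_HZ + span_hz / 2
    # High-side LO: the IF sweeps downward while the RF sweeps upward.
    if_start = LO_HZ - rf_start
    if_stop = LO_HZ - rf_stop
    # Spacing between adjacent measurement points.
    step = span_hz / (NUM_POINTS - 1)
    return rf_start, rf_stop, if_start, if_stop, step


if __name__ == "__main__":
    for span in (50e6, 20e6, 10e6):
        rf0, rf1, if0, if1, step = sweep_plan(span)
        print(f"span {span / 1e6:.0f} MHz: RF {rf0 / 1e6:.0f}-{rf1 / 1e6:.0f} MHz, "
              f"IF {if0 / 1e6:.0f}-{if1 / 1e6:.0f} MHz, step {step / 1e3:.2f} kHz")
```

With the point count fixed at 1601, the point spacing shrinks in proportion to the span (31.25 kHz at 50 MHz, 12.5 kHz at 20 MHz, 6.25 kHz at 10 MHz), so the three measurements differ in frequency step as well as in span.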

My questions are:
Is the phase (and group delay) reliable? I am oversampling sufficiently in the frequency domain, so I should not be aliasing.

Are there preconditions I need to meet before I can get a good group delay measurement in this mode? Group delay is calculated as -d(phi)/d(omega), i.e. -(1/360)*d(phi)/d(f) with phi in degrees. Does the 5071B compute it a different way? I heard or read once that for a group delay measurement the span should include DC (or 300 kHz in the case of the 5071B). Is this true?
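For reference, this is the calculation I have in mind, written as a minimal Python sketch. It is the textbook finite-difference estimate applied to a synthetic, ideal 2.5 us delay; I am not claiming this is how the 5071B does it internally, and the function names (`unwrap_deg`, `group_delay_s`, `ideal_delay_phase_deg`) are my own.

```python
def unwrap_deg(phase_deg):
    """Remove 360-degree jumps from a wrapped phase trace (degrees)."""
    out = [phase_deg[0]]
    for p in phase_deg[1:]:
        d = p - out[-1]
        # Fold the point-to-point change into [-180, 180).
        d = (d + 180.0) % 360.0 - 180.0
        out.append(out[-1] + d)
    return out


def group_delay_s(freq_hz, phase_deg):
    """Estimate group delay as -(1/360) * dphi/df between adjacent points."""
    ph = unwrap_deg(phase_deg)
    return [
        -(ph[i + 1] - ph[i]) / (360.0 * (freq_hz[i + 1] - freq_hz[i]))
        for i in range(len(ph) - 1)
    ]


def ideal_delay_phase_deg(freq_hz, tau_s):
    """Wrapped phase (degrees) of an ideal pure delay tau_s."""
    return [(-360.0 * f * tau_s + 180.0) % 360.0 - 180.0 for f in freq_hz]


if __name__ == "__main__":
    tau = 2.5e-6  # synthetic delay, matching the value I expect
    for span in (50e6, 20e6, 10e6):
        step = span / 1600  # 1601 points
        freqs = [70e6 - span / 2 + i * step for i in range(1601)]
        gd = group_delay_s(freqs, ideal_delay_phase_deg(freqs, tau))
        print(f"span {span / 1e6:.0f} MHz: mean delay {sum(gd) / len(gd) * 1e6:.3f} us")
```

For an ideal delay this estimate comes out the same at every span, as long as the phase change between adjacent points stays below 180 degrees (here it is about 28 degrees at the widest span). That is why the span dependence I see on the downconverter surprises me.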