Author Topic: Quake 2 Software Rendering with Coloured Lights

Maraakate
« on: August 18, 2015, 04:14:20 AM »
Hello all,

I'm rather new to using ASM, but have done some small adjustments to the existing codebase here and there.

Recently, someone updated the software renderer in Quake 2 to use a lookup table and add coloured lighting (previously it only cast white light for everything).  I have already been able to convert some of the extra code additions to ASM; most of them are a few lines and a small if/else statement.

However, there is one function that would see a decent speed increase if it could use the ASM version again instead of the C code path.  The change is an if/else if/else statement with about 3 extra lines, and it uses a CLAMP macro to make sure the value being set doesn't go out of bounds.

Here is the original C function:
Code:
#if !id386
void R_PolysetDrawSpans8_Opaque (spanpackage_t *pspanpackage)
{
    int lcount;

    do
    {
        lcount = d_aspancount - pspanpackage->count;

        errorterm += erroradjustup;
        if (errorterm >= 0)
        {
            d_aspancount += d_countextrastep;
            errorterm -= erroradjustdown;
        }
        else
        {
            d_aspancount += ubasestep;
        }

        if (lcount)
        {
            int lsfrac, ltfrac;
            byte *lpdest;
            byte *lptex;
            int llight;
            int lzi;
            short *lpz;

            lpdest = pspanpackage->pdest;
            lptex = pspanpackage->ptex;
            lpz = pspanpackage->pz;
            lsfrac = pspanpackage->sfrac;
            ltfrac = pspanpackage->tfrac;
            llight = pspanpackage->light;
            lzi = pspanpackage->zi;

            do
            {
                if ((lzi >> 16) >= *lpz)
                {
                    //PGM
                    if(r_newrefdef.rdflags & RDF_IRGOGGLES && currententity->flags & RF_IR_VISIBLE)
                        *lpdest = ((byte *)vid.colormap)[irtable[*lptex]];
                    else
                        *lpdest = ((byte *)vid.colormap)[*lptex + (llight & 0xFF00)];
                    //PGM
                    *lpz = lzi >> 16;
                }
                lpdest++;
                lzi += r_zistepx;
                lpz++;
                llight += r_lstepx;
                lptex += a_ststepxwhole;
                lsfrac += a_sstepxfrac;
                lptex += lsfrac >> 16;
                lsfrac &= 0xFFFF;
                ltfrac += a_tstepxfrac;
                if (ltfrac & 0x10000)
                {
                    lptex += r_affinetridesc.skinwidth;
                    ltfrac &= 0xFFFF;
                }
            } while (--lcount);
        }

        pspanpackage++;
    } while (pspanpackage->count != -999999);
}
#endif

Here is the updated version:

Code:
#if 1 //qb: no asm colored light support was- !id386
// leilei - colored lighting

void R_PolysetDrawSpans8_Opaque_Coloured(spanpackage_t *pspanpackage)
{

    do
    {
        lcount = d_aspancount - pspanpackage->count;

        errorterm += erroradjustup;
        if (errorterm >= 0)
        {
            d_aspancount += d_countextrastep;
            errorterm -= erroradjustdown;
        }
        else
        {
            d_aspancount += ubasestep;
        }

        if (lcount)
        {
            lpdest = pspanpackage->pdest;
            lptex = pspanpackage->ptex;
            lpz = pspanpackage->pz;
            lsfrac = pspanpackage->sfrac;
            ltfrac = pspanpackage->tfrac;
            llight = pspanpackage->light;
            lzi = pspanpackage->zi;

            do
            {
                if ((lzi >> 16) >= *lpz)
                {
                    //PGM
                    if(r_newrefdef.rdflags & RDF_IRGOGGLES && currententity->flags & RF_IR_VISIBLE)
                        *lpdest = ((byte *)vid.colormap)[irtable[*lptex]];
                    // leilei - colored lights begin
                    else if (coloredlights)
                    {
                        int lptemp = *lptex;
                        pix24 = (unsigned char *)&d_8to24table[lptemp];
                        //qb: works now...
                        /* FS: For my sanity... Bit shift pix24 over 15, 0 is low, 63 is max */
                        trans[0] = CLAMP((int)(pix24[0] * (pspanpackage->lightr * shadelight[0])) >> 15, 0, 63);
                        trans[1] = CLAMP((int)(pix24[1] * (pspanpackage->lightg * shadelight[1])) >> 15, 0, 63);
                        trans[2] = CLAMP((int)(pix24[2] * (pspanpackage->lightb * shadelight[2])) >> 15, 0, 63);

                        *lpdest = palmap2[trans[0]][trans[1]][trans[2]];
                    } // leilei - colored lights end
                    else *lpdest = ((byte *)vid.colormap)[*lptex + (llight & 0xFF00)];

                    //PGM
                    *lpz = lzi >> 16;
                }
                lpdest++;
                lzi += r_zistepx;
                lpz++;
                llight += r_lstepx;
                lptex += a_ststepxwhole;
                lsfrac += a_sstepxfrac;
                lptex += lsfrac >> 16;
                lsfrac &= 0xFFFF;
                ltfrac += a_tstepxfrac;
                if (ltfrac & 0x10000)
                {
                    lptex += r_affinetridesc.skinwidth;
                    ltfrac &= 0xFFFF;
                }
            } while (--lcount);
        }

        pspanpackage++;
    } while (pspanpackage->count != -999999);
}
#endif

The only addition is the else if (coloredlights) statement and its code.
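
To make clear what that branch computes per channel, here is a minimal standalone sketch of the math. The CLAMP definition is an assumption (the usual min/max form; the real macro in the codebase may differ), shade_channel is a hypothetical helper name, and the int/float types for the light and shade values are guesses based on the (int) cast in the code above.

```c
/* Assumed CLAMP definition; the real macro in the codebase may differ. */
#define CLAMP(x, lo, hi) ((x) < (lo) ? (lo) : ((x) > (hi) ? (hi) : (x)))

/*
 * Hypothetical helper, not part of the renderer: scale one 8-bit palette
 * channel by a span light value and a shade factor, shift right by 15,
 * then clamp the result into the 0..63 range used to index palmap2.
 * The types of `light` (int) and `shade` (float) are assumptions.
 */
static int shade_channel(unsigned char channel, int light, float shade)
{
    /* Same order as the original: cast to int first, then shift. */
    int scaled = (int)(channel * (light * shade)) >> 15;
    return CLAMP(scaled, 0, 63);
}
```

An ASM version would need the same three steps for each of the three channels: multiply, shift right by 15, then two compares to clamp against 0 and 63.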

The original ASM code is at: https://bitbucket.org/neozeed/q2dos/raw/624d7f8c8c7cdf157de51ad9f00a5a9838b4bc24/ref_soft/r_polysa.asm

I already have the proper externs defined for what I will need, and the span sizes have been adjusted to include the new fields they need (I have previously adjusted some code already to bring it back to its ASM state).

I know that after the irvisible check there has to be a coloredlights check followed by a jmp to the code that does its math, but I'm not sure where exactly to start, or whether it needs to be unrolled about 8 times like the loop in the existing ASM code.  The few small pieces of code I have left to convert are all like this: deeply unrolled, and I have no clue how to start on them.
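
One possible way to think about the structure, sketched in C rather than ASM and not taken from the existing code: none of the conditions tested in the inner loop (rdflags, the entity flags, coloredlights) are written inside the function, so the path could be picked once and each path could get its own pixel loop. All names below are hypothetical, the pixel operations are stand-ins, and the z-buffer test and texture stepping are left out to keep the sketch short.

```c
/* Hypothetical names throughout; none of these exist in the Quake 2 source. */
typedef unsigned char pixel_t;

enum span_mode { SPAN_IR, SPAN_COLOURED, SPAN_PLAIN };

/* Pick the drawing path once, in the same priority order as the C code. */
static enum span_mode pick_span_mode(int ir_visible, int coloured_lights)
{
    if (ir_visible)
        return SPAN_IR;
    if (coloured_lights)
        return SPAN_COLOURED;
    return SPAN_PLAIN;
}

/* Stand-in pixel operations so the sketch runs on its own. */
static pixel_t ir_pixel(pixel_t texel)       { return (pixel_t)(texel | 0x80); }
static pixel_t coloured_pixel(pixel_t texel) { return (pixel_t)(texel >> 2); }
static pixel_t plain_pixel(pixel_t texel)    { return texel; }

/* One loop per mode: each loop body has a single straight-line path. */
static void draw_span(pixel_t *dest, const pixel_t *tex, int count,
                      enum span_mode mode)
{
    int i;

    switch (mode)
    {
    case SPAN_IR:
        for (i = 0; i < count; i++)
            dest[i] = ir_pixel(tex[i]);
        break;
    case SPAN_COLOURED:
        for (i = 0; i < count; i++)
            dest[i] = coloured_pixel(tex[i]);
        break;
    default:
        for (i = 0; i < count; i++)
            dest[i] = plain_pixel(tex[i]);
        break;
    }
}
```

Whether separate loops like this are actually faster than a per-pixel jmp inside the unrolled ASM is something that would have to be measured.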

I'd appreciate some clues and good starting points, as I eventually want to convert other functions that were never in ASM for further potential performance gains.